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Abstract 

This  dissertation  addresses  the  problem  of  estimating  a 
vector —valued  stochastic  process  x  from  observations  of  a 
space-time  point  process  which  is  dependent  on  x  .  The 
observations  are  corrupted  by  statistically  independent, 
additive  point  process  noise.  The  research  is  motivated  by 
a  neutral  particle  beam  estimation  and  control  problem  in 
which  it  is  desired  to  estimate  the  position  of  the  beam 
from  detected  photo-electron  events.  Dark  current  in  the 
detector  and  other  photon  sources  comprise  the  noise 
sources. 

A  multiple  model  adaptive  estimator  is  developed  in 
which  the  separate  models  are  hypothesis  sequences.  The 
hypotheses  define  which  observed  events  were  due  to  the 
signal  process  and  which  were  due  to  the  noise  process.  The 
estimator  provides  the  minimum  mean  squared  error  estimate 
of  the  underlying  process.  The  problem  is  modeled  on  a 
cross  product  of  probability  spaces,  and  regularity 
conditions  are  defined  which  allow  calculation  of  the 
weighting  factors  for  the  multiple  model  estimator.  This 
modeling  concept  allows  feedback  from  the  observed  events  to 
the  model,  thus  providing  a  means  for  control  of  the 


x 


process.  The  multiple  model  adaptive  estimator  and  the 
cross  product  modeling  concepts  are  valid  for  a  general 
point  process  signal  in  point  process  noise  as  long  as  the 
regularity  conditions  are  met.  The  number  of  elemental 
filters  in  the  estimator  doubles  as  each  new  point  process 
event  is  observed. 

For  the  particle  beam  application,  the  elemental 
filters  are  Snyder-Fishman  "firefly"  filters,  in  which  the 
signal  process  is  assumed  Poisson  conditioned  on  the 
underlying  process. 

Simplifications  to  the  full  scale  estimator  are 
proposed  which  result  in  a  fixed  number  of  elemental 
filters.  This  is  accomplished  by  considering  only  data 
within  a  fixed  window.  The  data  windowing  is  applicable  to 
the  general  point  process  estimation  problem. 
Simplifications  which  reduce  the  complexity  of  the  multiple 
model  weighting  factor  calculations  are  developed  for  the 
particle  beam  application.  The  simplifications  result  in  a 
subopt imal  estimator. 

Monte  Carlo  simulations  of  the  subopt imal  estimator 
demonstrate  that  it  is  extremely  successful  at  rejecting 
point  process  noise  events  in  the  measurement  history,  even 
at  signal  to  noise  count  ratios  as  low  as  0.1  and  very  low 


data  rates. 


MULTIPLE  MODEL  ADAPTIVE  ESTIMATION 
FOR  SPACE-TIME  POINT  PROCESS  OBSERVATIONS 


I.  Introduction 


1.1  The  Problem 

The  problem  addressed  by  this  research  is  one  of 
estimating  parameters  of  an  underlying  stochastic  process 
from  observations  of  a  point  process  where  the  point  process 
is  dependent  on  the  underlying  process  and  the  observations 
are  corrupted  by  point  process  noise.  A  second,  closely 
related  problem  is  that  of  allowing  feedback  control  for  a 
system  in  which  observations  of  a  point  process  signal  are 
corrupted  by  point  process  noise.  This  will  provide  a 
method  for  investigating  the  optimal  stochastic  adaptive 
controller  for  the  system. 

The  major  contribution  of  this  research  is  a  method  for 
developing  an  estimator  for  the  above  mentioned  point 
process  signal  in  point  process  noise  environment.  The 
method  allows  feedback  to  the  model  from  the  observations 
thus  providing  a  means  for  control.  This  method  is  used  to 


develop  the  estimator  for  the  neutral  particle  beam  pointing 
and  tracking  problem  which  motivated  this  research. 


1.1.1  Problem  Motivation.  This  research  is  motivated 
in  part  by  problems  in  neutral  particle  beam  pointing  and 
tracking  currently  being  investigated  at  the  Ballistic 
Missile  Defense  Advanced  Technology  Center,  Huntsville, 
Alabama  and  the  Air  Force  Weapons  Laboratory,  Kirtland  AFB, 
New  Mexico.  Their  goal  is  not  only  to  estimate  the  position 
and  direction  of  the  beam,  but  to  use  that  information  in  an 
optimal  way  to  control  the  pointing  of  the  beam. 

A  method  for  sensing  the  location  of  the  neutral 
particle  beam  has  been  proposed  in  which  the  beam  is 
illuminated  by  one  or  more  lasers.  At  certain  angles  of 
intersection  and  particle  velocities,  the  particle  electrons 
absorb  photons  from  the  laser  beam  and  attain  a  higher 
energy  state.  The  electrons  spontaneously  decay  to  their 
ground  energy  states  and  radiate  photons  in  an  approximately 
isotropic  manner.  By  detecting  this  resonant  scattered 
light  energy,  the  position  of  the  beam  can  be  inferred. 

The  signals  resulting  from  detection  of  optical  fields 
can  be  modeled  by  conditional  Poisson  (CP)  random  processes 
(Refs.  25,28,33,46).  The  statistics  are  conditioned  on  the 
rate  parameter  of  the  Poisson  process,  which  is  proportional 
to  the  intensity  of  the  received  optical  envelope.  Real 
optical  detectors  typically  have  a  noise  mechanism  which  is 
independent  of  the  signal  and  can  be  modeled  by  another 
conditional  Poisson  process.  There  is  a  non-zero 
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probability  that  electrons  will  be  emitted  from  a 
photodetector  even  in  the  absence  of  incident  photons.  The 
resulting  current  is  called  dark  current.  In  general,  there 
will  also  be  noise  induced  by  background  light  sources  in 
the  field  of  view  of  the  detector.  If  statistical 
independence  is  assumed  between  the  three  processes,  the 
resulting  process  is  also  CP  with  a  rate  parameter  which  is 
the  sum  of  the  three  individual  rate  parameters.  This  is 
shown  in  Chapter  II. 

The  conditional  Poisson  process  model  is  required  when 
the  level  of  the  received  signal  is  so  low  that  individual 
photo-electron  events  must  be  counted.  At  higher  signal 
rates,  the  observed  current  might  be  adequately  modeled  by  a 
Gaussian  process  as  is  done  in  many  communication  and 
control  type  problems.  Only  the  point  process  signal  in 
point  process  noise  case  is  considered  in  this  research. 

The  problem  then  is  to  estimate  the  position  of  the 
neutral  particle  beam  from  observations  of  a  conditional 
Poisson  process.  The  observed  CP  process  is  composed  of  the 
signal  (scattered  resonant  photons  from  the  illuminated 
electrons)  and  noise  sources.  An  associated  problem  to  be 
considered  subsequently  is  control  of  the  pointing  of  the 
beam. 

The  observations  considered  here  are  of  a  space-time 
point  process  which,  conditioned  on  the  rate  parameter,  is 
Poisson.  The  rate  parameter  itself  is  a  stochastic 
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process  and  we  desire  to  estimate  some  function  of  the  rate 
parameter.  In  this  dissertation,  the  term  space-time  point 
process  is  used  to  denote  a  vector  valued  random  process 
which  is  a  mapping  from  the  cross  product  of  a  time  interval 
and  a  probability  space  into  [to,T)  X  Rm  (time  cross 

real  Euclidean  m  space).  Each  observation  is  of  the  form 

j. 

(t.,r.)  where  t^  is  the  time  of  occurrence  for  the  i 
observation  and  is  the  location  in  real  Euclidean  m 

space  for  the  i*'*1  observation. 

The  observed  conditionally  Poisson  (CP)  space-time 
process  is  composed  of  a  CP  process  of  interest  (the  signal) 
plus  a  CP  noise  process.  The  noise  and  signal  processes  are 
assumed  to  be  statistically  independent,  resulting  in  the 
observed  CP  process. 

A  second  sensing  mechanism  which  may  be  exploitable  for 
pointing  and  tracking  in  the  neutral  particle  beam  project 
is  induced  gamma  radiation.  When  the  beam  strikes  a  target, 
gamma  radiation  is  produced.  This  can  be  observed  by  an 
estimator/controller  and  used  to  direct  the  beam.  When  the 
signal  rates  are  low  so  that  point  process  statistics  must 
be  used  to  model  the  system  adequately,  the  results  of  this 
research  can  be  used  for  estimation  and  control. 

Other  possible  applications  for  these  techniques 
include  tracking  of  missiles  or  satellites  where  the 
observed  signal  rate  is  low,  necessitating  the  use  of  point 
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process  models.  The  results  presented  here  are  not 
dependent  on  active  illumination  of  the  target  so  a  passive 
tracking  system  with  low  observed  signal  rates  could  also 
fit  the  model. 

1.1.2  Key  Concepts.  The  key  concepts  in  this  research 

are : 

1.  A  space-time  point  process  signal  is  observed  in 
space-time  point  process  noise. 

2.  We  are  interested  in  estimating  some  vector  which 
parameterizes  the  signal  process. 

3.  We  want  to  allow  feedback  from  the  observations  in 
the  model  in  order  to  provide  a  means  of  control. 

For  the  neutral  particle  beam  problem: 

4.  Both  the  signal  and  noise  processes  are  modeled  as 
Poisson  processes,  conditioned  on  knowledge  of  the 
respective  (perhaps  random)  rate  parameters. 

5.  The  signal  and  noise  processes  are  assumed 
statistically  independent. 


1.2  Background  Literature 

The  literature  applicable  to  this  research  can  be 
divided  into  several  overlapping  categories.  Each  category 
addresses  a  portion  of  the  beam  estimation  and  control 
problem.  The  categories  are:  Poisson  process  estimation, 
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jump  processes,  space-time  point  process  estimation  and 
control,  and  multiple  model  adaptive  estimation  and  control 
methods. 

1.2.1  Poisson  Process  Estimation.  References  13,25, 
33,34,  and  46  contain  several  examples  of  estimation  for 
processes  modeled  by  Poisson  statistics.  Snyder  (Ref.  46), 
in  particular,  provides  the  requirements  for  modeling  a 
point  process  with  Poisson  statistics  and  presents  many 
useful  probability  densities  and  distributions  for  Poisson 
processes. 

Most  of  the  examples  in  these  references  are  oriented 
towards  communication  type  problems.  In  these  examples,  a 
time  sequence  of  point  events  is  observed  and  the  rate 
parameter  of  the  process  is  estimated.  A  second  similar 
problem  is  that  of  estimating  the  presence  of  an  on-off 
keyed  signal  in  noise.  These  communication  problems 
typically  do  not  include  any  spatial  observations  of  the 
process;  however,  the  forms  of  the  probability  densities 
are  analogous  to  those  developed  in  this  research  for  space- 
time  Poisson  process. 

1.2.2  Jump  Processes.  One  method  of  including  the 
spatial  nature  of  the  observed  process  is  to  model  it  as  a 
jump  process  (Refs.  8,41,42,49,50).  In  general,  a  jump 
process  is  one  in  which  point  events  occur  randomly  in  time 
and  there  is  a  value  or  weight  associated  with  each  observed 
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event.  An  example  of  a  system  which  might  be  modeled  by  a 
jump  process  is  urban  vehicle  traffic  in  which  sensors 
measure  time  of  arrival  and  the  jump  value  might  be  speed  or 


direction.  For  the  problem  at  hand,  we  might  model  the 
system  as  a  jump  process  in  which  the  value  (or  weight)  of 
the  jump  is  the  m  dimensional  spatial  location  of  the 
observation.  Vaca  and  Tretter  (Ref.  49)  discuss  optimal 
estimation  for  the  traffic  example  but  no  noise  sources  are 
considered. 

Segall  and  Kailath  (Ref.  42)  consider  modeling  of 
randomly  modulated  jump  processes.  This  model  addresses  our 
goal  of  inferring  information  about  the  observed  signal 
point  process;  however,  they  approach  noise  as  either  an 
additive  white  Gaussian  source  or  as  an  additive  point 
source  in  a  binary  detection  type  of  problem.  Jump 
process  models  address  some  portions  of  our  point  process 
signal  in  point  process  noise  problem,  but  the  model  does 
not  fit  all  aspects  and  jump  processes  are  not  considered 
further  in  this  research. 

1.2.3  Space-Time  Point  Processes.  The  modeling  of  a 
system  as  a  space-time  point  process  (Refs. 
11,12,35,38,45,46,47)  is  very  applicable  to  this  problem. 
In  particular,  the  basic  definitions  and  tools  for 
statistical  inference  for  space-time  point  processes  are 
developed  in  Fishman  (Ref.  11)  and  Fishman  and  Snyder  (Ref. 
12).  Each  observation  of  a  space-time  point  process 
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consists  of  a 


time  of  event  occurrence  and  a  spatial 

coordinate  of  the  event.  In  this  research,  the  spatial 

coordinate  is  an  element  of  real  Euclidean  m  space  and  the 
t  h 

il  measurement  consists  of  the  pair 

(ti,ri)e[t o ,T)  X  Rm 

where 

t^  is  the  time  of  occurrence 

r^  is  the  spatial  vector  of  the  event 

[to,T)  is  the  time  interval  of  the  observations 
Rra  is  real  Euclidean  m  space. 

If  we  let  be  the  number  of  observed  point  process 

events  in  the  interval  [t0,T)  ,  then  the  measurement 

history  can  be  defined  as 

Nt  A 

Z  =  ( ( 1 1  ,n),  ...  ,  (tN  ,rN  )} 

t  t 

Snyder  and  Fishman  (Ref.  47)  present  a  conceptually 
pleasing  motivation  for  this  model  and  develop  the 
associated  estimator  for  the  case  when  no  noise  is  present 
in  the  observations,.  They  pose  the  problem  as  one  of 
tracking  the  centroid  of  a  swarm  of  fireflies.  The  swarm  is 
assumed  to  have  a  Gaussian  shaped  density  in  real  Euclidean 


m  space  and  its  centroid  is  assumed  to  move  in  real 
Euclidean  m  space  as  a  linear  function  of  the  output  of  a 
linear  n  dimensional  dynamical  system  driven  by  a  standard 
Wiener  process.  The  n  dimensional  output  is  Markov-1,  but 
the  motion  of  the  centroid  is  not  necessarily  Markov-1.  The 
observer  is  allowed  to  view  the  flashes  of  the  fireflies  and 
the  measurements  consist  of  the  flash  occurrence  times  and 
locations  in  m  space.  Given  the  centroid  of  the  Gaussian 
shaped  swarm,  the  flashes  are  assumed  to  occur  as  a  Poisson 
process  and  the  Gaussian  shaped  swarm  corresponds  to  the 
rate  parameter  of  the  space-time  conditionally  Poisson 
process.  They  show  that  the  estimate  of  the  centroid  is 
Gaussian  and  the  structure  of  the  estimator  is  analogous  to 
a  Kalman  filter.  There  are  propagation  and  update  phases  of 
the  estimator  and  there  is  a  residual  term  in  the  estimator 
structure  similar  to  that  of  the  Kalman  filter.  The  updates 
occur  at  the  event  times  rather  than  at  some  a  priori  chosen 
sample  times.  Although  this  model  and  estimator  included  no 
noise  sources,  it  forms  a  basis  for  this  research. 

Snyder,  Rhodes,  and  Hoversten  (Ref.  48)  extended  the 
usefulness  of  this  model  and  estimator  with  the 
demonstration  of  a  separation  theorem.  They  showed  that, 
for  the  "firefly"  tracking  problem,  the  optimum  stochastic 
controller  is  decomposable  into  an  independently  designed 
estimator  and  the  linear  deterministic  optimal  controller. 
The  estimator  is  the  Snyder  and  Fishman  filter  (Ref.  47)  and 
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the  linear  control  law  is  the  deterministic  result  obtained 
if  the  output  of  the  n  dimensional  dynamical  system  were 
known  exactly. 
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This  result  is  important  since  the  goal  of  virtually 
any  estimator  used  for  tracking  is  to  control  some  system  to 
maintain  track.  The  separation  theorem  provides  the 
synthesis  method  for  attaining  this  control.  Secondly,  the 
separation  theorem  provides  a  simple  form  for  the  optimal 
controller  and  can  provide  insight  into  a  possible 
separation  theorem  for  the  case  of  point  process  signal 
observed  in  point  process  noise.  If  a  separation  theorem  is 
not  possible,  this  result  may  still  provide  ght  for 
using  forced  certainty  equivalence  (Ref.  27  vol.  111:17)  to 
generate  a  controller. 

Santiago  (Ref.  38)  investigated  limitations  of  optical 
trackers,  including  the  effects  of  noise  on  the  "firefly" 
estimator.  He  performed  simulations  on  the  estimator  both 
with  and  without  unmodeled  point  process  noise  corruption  of 
the  data.  As  might  be  expected,  the  estimator's  performance 
degraded  significantly  when  the  noise  was  present,  since  the 
noise  was  unmodeled.  Santiago  developed  some  ad  hoc  methods 
to  reduce  the  influence  of  the  noise.  One  method,  motivated 
by  residual  monitoring  techniques  in  Kalman  filter 
applications,  was  to  ignore  measurements  which  resulted  in  a 
residual  magnitude  above  some  predetermined  threshold. 
Simulations  showed  a  significant  improvement  in  the 
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estimator's  performance  with  these  methods  of  dealing  with 
the  noise,  and  his  results  suggest  that  a  proper  theoretical 
development  of  an  estimator  with  modeled  noise  could  be 
successful.  Some  form  of  disregarding  or  deweighting  of 
events  suspected  of  being  caused  by  noise  might  be  a  useful 
course  of  action  in  developing  an  estimator  for  this  signal- 
in-noise  environment. 

1.2.4  Decision  Theory  and  Multiple  Model  Estimators. 
The  two  topics  included  in  this  literature  category  concern 
methods  for  disregarding  or  deweighting  measurements  which 
contain  little  or  no  information  about  the  process  of 
interest.  Investigation  of  these  topics  is  motivated  by  the 
results  of  Santiago. 

Binary  decision  theory  methods  are  presented  by 
references  4,20,21,22,39,  and  44.  Lainiotis  (Refs. 
20,21,22)  discusses  algorithms  for  adaptive  estimation  of 
both  the  system’s  structure  and  parameters  via  decisions  on 
a  binary  hypothesis  model  as  each  measurement  is  taken. 
Athans,  Whiting,  and  Gruber  (Ref.  4)  discuss  the  general 
binary  hypothesis  decision  theory  for  estimators,  two 
methods  of  incorporating  the  weighted  data,  and  a  specific 
linear  Gaussian  model  example. 

In  all  six  of  these  decision  theory  papers,  Bayesian 
statistics,  a  priori  knowledge  of  the  hypotheses,  and  the 
measurement  history  are  used  to  evaluate  the  validity  of  the 
most  recent  measurement.  Emphasis  is  placed  on 
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incorporating  or  rejecting  the  latest  measurement  when 
received  and  then  considering  it  no  further  (except  for  the 
implicit  effect  each  measurement  has  on  the  estimate  at 
subsequent  time).  The  goal  is  to  obtain  a  recursive 
algorithm  for  incorporation  of  the  most  recent  measurement 
in  order  to  minimize  memory  and  calculation  requirements. 

This  differs  from  the  concept  of  multiple  model 
adaptive  estimation  (MMAE)  and  multiple  model  adaptive 
control  (MMAC)  as  presented  in  references  2,3,5,9,19,26,27 
vols.  II  and  111,31,43,51,  and  52.  In  MMAE,  separate  model 
estimators  are  maintained  throughout  the  observation  time 
interval.  Magill  (Ref.  26)  and  Athans  and  Chang  (Ref.  2) 
describe  the  basic  MMAE  method.  As  in  most  of  the  MMAE /MMAC 
papers  cited,  the  problems  under  study  are  modeled  by  linear 
systems  driven  by  Gaussian  noises  and  the  separate  models 
are  chosen  by  selection  of  different  parameter  matrices  for 
the  model.  This  selection  of  models  can  be  made  either  by 
knowledge  that  only  a  finite  number  of  models  can  exist,  or 
more  commonly  by  discretization  of  the  range  of  the 
parameter  values.  A  set  of  estimators  ("bank  of  filters" 
which  are  Kalman  filters  in  the  linear  Gaussian  model  case) 
is  designed,  one  matched  to  each  distinct  model.  The 
estimates  from  each  filter  are  weighted  and  summed  to  obtain 
the  overall  estimate.  The  weights  are  determined  using 
Bayesian  statistics  and  any  a  priori  statistical  knowledge 
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of  the  models  in  a  manner  similar  to  the  binary  decision 
theory  estimators. 

Stability  and  convergence  of  MMAE/M11AC  algorithms  are 
discussed  in  references  6,14,15,16,17,  and  29.  Baram  (Ref. 
6)  presents  consistency  and  convergence  results  for  a  large 
class  of  maximum  a  posteriori  (MAP),  maximum  likelihood 
(ML),  least  squares  (LS),  and  Bayesian  estimators  through 
the  use  of  information  metrics.  Hawke s  and  Mcore  (Ref.  15) 
use  similar  methods  to  show  that  for  a  MMA  estimator  with  a 
finite  number  of  models,  the  weighting  coefficient  converges 
almost  surely  to  one  for  the  model  closest  to  the  true 
parameter. 

1.3  Research  Approach 

The  approach  taken  in  this  research  is  to  develop  an 
estimator  for  the  space-time  point  process  signal  plus  noise 
system  using  multiple  model  adaptive  estimation  techniques. 
For  the  particle  beam  application,  Snyder  and  Fishman 
"firefly"  filters  are  used  for  the  individual  filters  in  the 
"bank"  (Ref.  47).  Each  assumed  model  (or  hypothesis)  is  a 
distinct  sequence  specifying  which  observed  events  are  noise 
and  which  observed  events  are  signal.  Once  a  model  is 
specified  by  the  hypothesis,  only  those  measurements  assumed 
caused  by  the  signal  process  are  considered  by  the 
individual  filter.  The  overall  estimate  is  a  weighted  sum 
of  the  individual  filters'  estimates  as  in  the  linear 


Gaussian  MMAE  examples.  The  estimator  is  developed  to  admit 


feedback  from  the  observations  to  the  model.  This  will 
allow  for  definition  of  an  optimal  controller  for  the 
system.  A  covariance  expression  for  the  estimator  is 
developed  and  methods  for  reducing  the  computational 
complexity  are  investigated. 


1.4  Summary  of  Remaining  Chapters 

In  Chapter  II,  the  detailed  signal-in-noise  model  for 
the  system  is  defined  and  conditions  for  using  Poisson 
statistics  are  specified.  A  brief  description  of  MMAE  is 
presented  and  the  structure  of  the  MMAE  for  this  point 
process  problem  is  developed.  Weighting  coefficients  are 
developed  using  a  probability  density  approach.  The 
expressions  for  the  coefficients  are  very  difficult  to 
compute,  thus  motivating  the  cross  product  space  modeling 
concepts  presented  in  Chapter  III. 

In  Chapter  III,  a  description  of  some  of  Fishman's 
statistical  inference  results  for  doubly  stochastic  space- 
time  point  processes  is  given,  including  a  regularity 
definition  and  the  implications  of  regularity  in  a  point 
process.  The  beam  problem  is  cast  as  a  doubly  stochastic 
space-time  point  process  and  an  analytic  cross  product  space 
model  is  developed  for  the  problem.  A  regularity  proof  is 
given  for  this  model. 


In  Chapter  IV,  the  cross  product  space  model  is  used  to 
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develop  the  weighting  factors  for  the  multiple  model 
adaptive  estimator.  The  complete  equations  for  the  MMA 
estimator  for  the  beam  problem  are  presented  and  some 
example  cases  are  presented. 

The  full  scale  estimator  requires  an  exponentially 
growing  amount  of  calculation  and  memory.  Methods  to 
simplify  the  estimator  are  presented  in  Chapter  V.  These 
methods  result  in  suboptimal  estimators. 

In  Chapter  VI,  results  of  Monte  Carlo  simulations  are 
presented.  The  simulations  are  based  on  the  suboptimal 
filter  simplifications  of  Chapter  V. 

Conclusions  and  recommendations  are  presented  in 


Chapter  VII. 


II .  Multiple  Model  Adaptive  Estimation 


II. 1  Introduction 

In  this  chapter,  the  basic  models  for  the  signal  and 
noise  processes  are  presented  and  multiple  model  adaptive 
estimation  for  sequence  hypotheses  is  developed.  In  Section 
II. 2,  the  basic  models  for  signal  and  noise  are  presented. 
The  signal  model  (with  no  noise  sources  present)  is  the  same 
as  used  by  Snyder  and  Fishman  in  their  filter  development 
(Ref.  47).  Their  results  are  presented  in  Section  1 1. 3.  In 
Section  II. 4,  the  concept  of  multiple  model  adaptive 
estimation  (MMAE)  is  motivated  in  order  to  deal  with  the 
noise,  and  the  general  MMAE  structure  is  developed.  The 
MMAE  concept  is  then  applied  to  the  general  point  process 
problem  by  considering  each  model  to  be  described  by  a 
distinct  hypothesis  sequence  that  defines  which  observed 
events  are  due  to  noise  and  which  are  due  to  signal. 
Finally,  the  explicit  MMAE  filter  for  the  particle  beam 
problem  is  presented  in  which  the  a  posteriori  statistics 
are  developed  from  a  probability  density  point  of  view.  In 
general,  these  results  are  difficult  to  compute,  thus 
motivating  Chapter  III. 


I I. 2  The  Model 


II. 2.1  Signal.  In  this  research,  the  signal  source  is 
the  excited  volume  of  neutral  beam  particles.  The 
spontaneous  decay  of  electrons  results  in  emission  of 
photons  which  may  be  observed  by  an  array  of  photodetectors. 
We  wish  to  determine  the  position  of  the  beam  from  the 
observed  photo-electron  events. 

As  in  Snyder  and  Fishman  (Ref.  47),  the  signal  is 
modeled  as  a  space-time  point  process  on  [to,05)  X  Rm 
Each  observation  (photon  detection)  has  associated  with  it  a 
time  of  occurrence  te[to,«>)  and  spatial  location  reRm  .  A 
physical  detector  array  will  result  in  a  quantization  of  Rrn 
into  a  finite  number  of  possible  points.  This  quantization 
and  any  resulting  effects  are  not  addressed  in  this 
research;  the  spatial  measurements  are  allowed  to  assume 
any  value  in  Rm  (or  perhaps  some  properly  defined  subspace 
of  Rm  ) . 


Let  T 

and 

A  be  Borel 

sets 

in  [to,00) 

and  Rm 

respectively 

and 

let  N(T  X  A) 

be  the 

number  of 

observed 

point  events  in  T  X  A  .  The  number  of  observed  events  up  to 
time  t  (regardless  of  spatial  location)  is  defined  as 


N  =  N( [t  o , t  )  X  R 


m, 


(1) 


The  measurement  history  over  the  interval  [t0,t)  consi^ 


of  a  sequence  of  pairs 


where  the  under  tilde  denotes  a  random  process,  A(t)  is  a 
known  amplitude  of  the  density  function,  H(t)  is  a  known  m 
by  n  projection  matrix,  R(t)  is  a  known  symmetric 
positive  definite  matrix  for  all  t£[to,t^)  ,  and  the  »(t) 
indicates  dependence  on  time.  In  Chapter  III,  a  modeling 
method  is  developed  which  allows  these  "known"  parameters  to 
be  random;  however,  the  rate  parameter  expression  given  in 
equation  (3)  will  be  used  for  the  motivating  particle  beam 
problem.  The  vector  x(t)  is  an  n  dimensional  Gaussian 
output  of  a  linear  stochastic  differential  equation 


dx(t)  =  F(t)x(t)dt  +  G(t)du(t) 


(4) 

—  V  —  > 

x(to)  =  X0  t  -  to 
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where  F(t)  is  an  n  by  n  dimensional  known  matrix  function 
of  time,  G(t)  is  an  n  by  k  dimensional  matrix  function  of 
time,  u(t)  is  a  standard  k  dimensional  Wiener  process  of 
unit  diffusion,  and  xo  is  a  Gaussian  random  variable  with 
covariance  £,o  and  mean  xo  .  This  definition  is  expanded 
in  Section  III. 5  to  include  feedback  control. 

This  form  of  x(t)  is  useful  because  it  is 
descriptive  of  a  large  class  of  estimation  and  control 
0*  problems,  it  is  flexible,  and  it  results  in  an  estimator 

(described  in  Section  1 1. 3)  which  is  analogous  to  a  Kalraan 
filter  in  several  ways.  Neither  the  Gaussian  shaped  signal 
rate  parameter  nor  the  linear  dynamical  form  of  x(t)  are 
required  for  the  multiple  model  nature  of  the  full  scale 
estimator.  We  could  consider  a  more  general  form  of  the 
signal  rate  parameter  and,  perhaps,  a  non-linear  stochastic 
equation  to  define  x(t)  .  With  the  appropriate  elemental 
estimator  for  this  more  general  model,  we  can  still  use 
multiple  model  adaptive  estimation  concepts  for  the  full 
scale  estimator. 

Surfaces  of  constant  particle  density  form  ellipsoids 
in  Rm  and  the  centroid  of  the  ellipsoids  is  H(t)x(t) 
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The  shape  and  size  of  the  ellipsoids  can  change  with  time  in 
a  deterministic  manner  and  the  centroid  moves  as  a  Gaussian 
process . 

We  assume  that : 


lim 
t  ,  p4-0 


Pr[N( [t , t+T )  X  c(r,p))=l| Bt,x(a) ;a-t0] 


lim  Pr[N(Ct,t+'0  X  c(r,p)£l| 8t ,x(a) ;a-t0] 

T.p-K)  Tp1 


.m 


(5) 


=  As(t,r,x(t)) 

where  is  the  sub  sigma  algebra  of  events  up  to  time  t, 
and  where  cCr.p")  =  [r i ,r j+p i )*. . .x[rm, r  +p  )  is  a 
volume  in  Rm .  As  a  result,  given  the  process  x(t)  ,  the 
point  events  occur  in  [to,00)  X  Rm  as  a  conditional  space- 
time  Poisson  point  process.  The  particle  density 
A  (t,r,x(t))  defined  in  equation  (3)  is  the  rate  parameter 
of  this  conditional  Poisson  process. 

Note  that  in  the  current  description  of  the  signal 
model,  R(t)  ,  A(t )  ,  and  H(t)  are  time  varying 
deterministic  quantities.  The  model  is  expanded  to  admit 
randomness  in  each  of  these  in  Chapter  III. 


2 


r 


II. 2. 2  Noise.  We  let  the  noise  be  modeled  by  a  space- 
time  point  process  on  [to,00)  X  Rm  and  assume  that 


lim 
x ,  p+0 


Pr[N( [t , t+x )  X  c(r, p) )=1 | Bt] 


(6) 


(t,r) 


so  that  noise  induced  photo-electron  events  occur  as  a 
space-time  Poisson  process  on  [to,00)  X  Rm  with  rate 
parameter  An(t,r)  .  If  we  allow  the  rate  parameter  to 
depend  on  some  random  process  0  ,  then  Xn(t,r)  is  a 
random  process  and  the  noise  events  occur  as  a  Poisson 
space-time  point  process  conditioned  on  knowledge  of  0  . 

The  noise  process  is  assumed  to  be  statistically 
independent  of  the  signal  process  and  additive.  At  this 
point,  there  is  no  requirement  to  define  further. 
We  can  develop  the  MMAE  concepts  with  the  current  general 
description  of  £n(t,r)  .  In  Chapter  III,  additional 
restrictions  will  be  placed  on  Xn(t,r) 


that 


and 


If 


\n(T  ,-Od Adx<« 


to  Y 


(7) 


where  Y  is  Rm  or  a  subspace  of  Rm  to  which  we  restrict 
the  estimator.  This  constraint  is  necessary  for  the 
regularity  proof  in  Chapter  III  and  it  is  not  physically 
very  restrictive.  The  condition  does  imply  that  the 
observation  of  a  noise  induced  event  is  not 
(probabilistically)  certain.  For  example,  the  subspace  Y 
could  correspond  to  a  finite  two  dimensional  array  of 
photodetectors  and  the  noise  rate  parameter  could  be  a 
constant . 

The  observed  point  process  is  composed  of  the  sura  of 
the  signal  and  the  noise  point  processes.  Since  the  signal 
and  noise  processes  are  independent,  the  probability  density 
function  of  their  sum  is  the  convolution  of  the  individual 
probability  density  functions  (Ref.  33:189),  and  the 
characteristic  function  of  the  sum  is  the  product  of  the 
individual  characteristic  functions  (Ref.  33:159).  The 
characteristic  function  for  the  signal  process  (conditioned 
on  x(t))  is 
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and  the  characteristic  function  for  the  noise  process 


(conditioned  on  knowledge  of  any  uncertainties  in  Xfl  )  is 

exp[Xu(t,r  )  (e^-l)] 

where,  for  these  two  expressions  only,  j  =  /^T  and  oj 

is  the  frequency  domain  variable;  this  notation  is  used 
here  to  be  consistent  with  the  notation  of  Papoulis  (Ref. 
33).  The  product  of  these  two  conditional  characteristic 
functions  is  the  characteristic  function  of  the  sum  of  the 
processes.  The  form  of  the  product  is  that  of  a  conditional 
Poisson  point  process  with  rate  parameter 


X(t,r,x(t))  =  Xs(t,r,x(t))  +  An(t,r) 


(8) 


II. 3  Snyder  and  Fishman  Filter 

Snyder  and  Fishman  (Ref.  47)  present  an  estimator  for 
the  vector  x(t)  (when  no  noise  is  present)  as 


CM 


Nt  r  _ 

x(t)  =  E{x(t )  |  Z  l}=  I  S(t)p-(t)|ZNt  (C(t)|Zr<t)dC  (9) 


where  p(£|ZNt)  is  the  conditional  probability  density 
function  of  x(t)  given  the  measurement  history  Z  x 
The  estimator  is  developed  for  the  signal  model  defined  in 

the  previous  section  when  there  is  no  noise:  X  (t)  =  0 

n 

The  estimator  is  presented  in  differential  form  as 


dx(t)  =  F(t)x(t)+  /  K(t)[r-H(t)x(t)]N(dt  X  dr) 


(10) 


d£(t)  =  F(t)£(t)dt+£(t)FT(t)dt+G(t)GT(t)dt 


-  /  K(t)H(t)£(t)N(dt 
Rm  " 


X  dr) 


(11) 


K(t)  =  Z(t)Hi(t)[H(t)Z(t)II1(t)+R(t)]' 


(12) 


x(  t  o  )  =  X( 


2(t0)  »  Ic 


(13) 


where  /• N(dt  X  dr)  is  a  counting  integral  (Ref.  11).  They 
also  demonstrate  that  the  conditional  density  function 
P(x(t) |  Z  ^ )  is  Gaussian. 

In  the  expanded  model,  which  includes  an  independent 
noise  source,  if  we  knew  precisely  which  observed  events 
were  due  to  the  signal  process  and  which  were  due  to  noise, 
then  we  could,  trivially,  use  Snyder  and  Fishman's  filter 
and  only  consider  the  signal  observations.  We  don't  know 
v/hich  observed  events  are  noise,  but  we  can  use  a  noise 
rejection  idea  througn  multiple  model  adaptive  estimation 
techniques. 


II. 4  Multiple  Model  Adaptive  Estimation 


In  this  section,  multiple  model  adaptive  estimation  is 
presented  in  general  terms  and  then  the  specific  MMAE 
equations  are  developed  for  the  point  process  signal  plus 
noise  problem. 

I I. 4.1  General  MMAE.  Let  us  suppose  that  we  desire  to 
estimate  the  value  of  some  quantity  x(t)  which  is  a  random 
process.  If  we  use  the  minimum  mean  square  error  criterion 


1 


of  optimality,  we  can  define  the  optimal  estimate  as  the 

N 

expected  value  of  x(t)  given  the  measurement  history  Z  t 


x(t)  =  E(x(t) |ZNt} 


J r(t)P(c(zN^)dc 


(14) 


where  £(t)  is  the  dummy  variable  of  integration  for  x(t) 
—  N 

and  p(£(t)|Z  t)  is  the  conditional  probability  density 
function  of  x(t)  given  the  measurement  history  ZNt 
(Note  that  the  subscripts  on  the  probability  density 
function  have  been  dropped  for  simplicity  of  notation.  This 
convention  is  used  in  the  rest  of  this  dissertation  unless 
the  subscripts  are  needed  for  clarity.)  We  use  the  same 
notation  to  describe  the  measurement  history  here  as  we  did 
for  the  point  process  model  description  in  Section  II. 2  in 
order  to  maintain  continuity  of  notation.  For  this  general 
MMAE  development,  the  measurement  history  zN^  is  whatever 
measurement  is  appropriate  to  the  physical  problem  and  model 
under  consideration.  There  is  no  implication  that  the 
general  MMAE  method  is  restricted  to  space-time  point 
processes.  The  integration  in  equation  (14)  is  over  the 
domain  of  x(t)  and  we  assume  that  the  probability  density 
function  exists  for  the  Riemann  integral  to  have  meaning. 


i  m 


Further  suppose  that  we  do  not  know  how  to  model  the 
process  x(t)  exactly  to  obtain  the  expression  for 
p(^(t)|Z^t)  but  that  we  know  that  the  correct  model  is  one 
of  a  finite  set  of  possible  models.  (The  restriction  to  a 
finite  set  of  possible  models  is  not  necessary  for  MMAE  in 
general.  In  this  point  process  application,  however,  it 
will  be  natural  to  accept  this  type  of  model  restriction. 
That  course  is  taken  in  this  development.  Athans  and  Chang 
(Ref. 2)  consider  linear  Gaussian  models  which  are 
discretized  to  a  finite  set  from  a  possible  continuum  of 
models . ) 

Let  there  be  J+l  possible  models  to  represent  the 
process  x(t)  ,  where  each  model  is  represented  by  h^  t 
j  e  0,1, 2, 3,  ...,  J  .  Let 


h.  e  H 

J 


(15) 


where  H  is  an  appropriately  defined  space.  In  an  example 
where  the  different  models,  or  hypotheses,  are  represented 
by  real  matrices,  the  space  H  could  be  a  sufficiently 
dimensioned  real  Euclidean  space. 

From  equation  (14),  our  definition  for  the  optimal 
estimator  is 
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:(t)  =  J  5(t)p(?(t)jZi,t)d£; 


(16) 


The  conditional  probability  density  function  also  depends  on 
the  model  h.  so  we  can  obtain  the  marginal  density  function 
from  the  joint  density  function  by 


x(t) 


=  Jt(  t)/i 

J  II 


P(^(t)>|zNt)dlidr, 


(17) 


where  h  is  the  dummy  variable  for  h.eH  .  By  Bayes'  rule 

3 


x(t ) 


=  fzwf p(?(t)|h,zNt)P(^|zNt)dad^ 


(18) 


Because  we  have  limited  the  models  to  a  finite  set,  the 
probability  density  function  p(fi|ZNt)  is  a  discrete 
density  of  the  form 
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(19) 


J 

p(fe|Z^)  =  ^  Pr[hj  correct  I  Z^]6(/i-hj ) 

3=0 


where  <5(*)  is  the  Dirac  delta  function  and  Pr[»]  denotes 
probability.  We  can  substitute  equation  (19)  into  equation 
(18)  to  get 


x(t) 


J 

p(r(t)|k,ZNt)  £  Pr[hj|ZNt]6(^i-hj)d^dr 
j=o 


(20) 


and  by  the  sifting  property  of  the  delta  function 


x(t) 


-/««  2  p(C(t)|hJ,ZNt)Pr[hj  |zNt]dC 
3=0 


(21) 


The  interchange  of  the  integration  and  summation  from 
equation  (20)  to  equation  (21)  is  justified  by  the  fact  that 
the  sum  is  finite.  By  changing  the  order  of  the 
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integration  and  summation  again,  we  obtain 


x(t)  =  Pr[hj|ZNt]y*C(t)p(C(t)|h.,ZNt)d5  (22) 


^  Pr[hj |ZNt]E{x(t) |h.. ,ZWt} 
j=0 


(23) 


Pr[h.  |  ZNt]x.(t) 

J  J 

j~0 


(24) 


In  equation  (24),  x,(t)  is  the  estimate  of  x(t) 

J  ^ 

conditioned  on  the  measurement  history  and  the  specific 
model.  The  various  densities  in  the  above  development  are 
assumed  to  exist,  although  a  parallel  development  can  be 
made  using  probability  distribution  or  measure  theory 
notation.  The  overall  structure  of  the  estimator  is  shown 
in  Figure  1. 

We  also  desire  to  find  the  covariance  for  the  multiple 


a 
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model  adaptive  estimator.  This  is  useful  as  a  measure  of 
the  estimator's  performance,  although  it  is  not  necessary  to 
calculate  it  for  the  online  estimator.  The  covariance  for 
the  full  scale  estimator  is  defined  as 


£(t)  ^  E{(x(t)  -  x(t»(x(t)  -  x(t))T|ZNt} 


(25) 


We  define  the  covariance  of  the  individual  estimators  in  the 
"bank"  as 


£j(t)  =  E 


(x(t) 


Xj(t))(x(t)  -  tj(t))T|hJ,ZNt 


(26) 


The  covariance  of  the  multiple  model  adaptive  estimator,  in 
terms  of  the  individual  covariances  and  estimates,  is  (Ref. 
2:30  and  7:420) 

J 

£(t)  -  ^  PrChj!zNt][£j(t)+[t,(t)-$(t)][xJ(t)-x(t)]T](27) 
j=0 


Examples  of  MMAE  for  linear  systems  driven  by  white 
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Gaussian  noise  are  given  in  references  2,7,26  and  27  vol. 
II.  For  these  models,  the  individual  estimators  are  Kalman 
filters,  each  one  tuned  to  match  the  associated  model 
hypothesis.  The  weighting  terms,  Prth.jz^t]  ,  can  be 

J 

calculated  recursively  (Ref.  2:33).  Except  for  the  fact 
that  J+l  Kalman  filters  must  be  operated  simultaneously,  the 
filter  structure  is  computationally  reasonable  via 
distributed  processing.  The  requirements  for  memory  and 
calculation  do  not  expand  as  each  measurement  is  made. 

Note  that  this  development  has  assumed  that  only  one 
hypothesis  is  correct  over  the  observation  interval.  If  the 
system  is  modeled  such  that  it  is  allowed  to  switch  from  one 
hypothesis  to  another  between  measurement  times,  then  we 
must  calculate  the  a  posteriori  probabilities  based  on 
histories  of  hypotheses.  This  leads  to  an  exponentially 
expanding  number  of  filters  in  the  "bank"  (Ref.  7).  If  the 
switching  is  allowed  to  occur  as  a  Markov-1  process,  then 
the  number  of  filters  expands  as  the  square  of  the  number  of 
possible  states.  The  expanding  number  of  hypotheses  and  the 
resulting  expanding  requirements  for  memory  and  calculation 
characterize  multiple  model  adaptive  estimation  for  the 
point  process  problem  under  consideration. 

II. 4. 2  MMAE  for  Point  Process  Signal  in  Noise.  With 
the  signal  and  noise  models  described  in  the  first  section 
of  this  chapter,  the  observed  process  is  conditionally 


« 


Poisson  and  the  observations  consist  of  a  sequence  of  pairs, 
time  of  occurrence  and  spatial  location,  as  shown  in 
equation  (2).  Each  observed  point  event  (  (t,r)  pair)  is 
either  due  to  the  underlying  signal  process  or  to  the 
underlying  noise  process.  We  can  use  this  concept  of  a 
binary  decision  at  each  observed  point  event  to  construct 
all  of  the  possible  sequences  of  noise/signal  events  which 
could  have  produced  the  observed  sequence. 

For  example,  at  time  t0,  is  zero;  no  events  have 
been  observed.  When  the  first  point  event  is  observed  at 
time  tj  and  location  ri,  we  have  a  measurement  sequence 
consisting  of  one  data  point  for  the  observation  interval 
[t0,t),tx<t  .  This  observed  event  could  have  been  caused 
by  either  the  signal  process  or  the  noise  process.  We  can 
represent  the  possible  hypotheses  with  a  tree  diagram  as  in 


Figure  2. 

A  hypothesis  sequence  is  denoted  as  h . t  where  the 
subscript  je{0,l,  ...  ,(2^)-!}  denotes  which  particular 


sequence  is  identified  and  the  superscript  N,  is  the 

L 

number  of  data  points  observed  up  to  the  time  when  the 


sequence  h.*  is  defined. 
3 

hjNt(i),  i=l,2 , 3,  ...  Nt 


When  an  argument  is  present  as  in 
,  we  refer  to  the  value  of  the 


sequence  at  time  t^  .  A  hypothesis  sequence  can  be  written 


as 
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Figure  2.  Hypothesis  Sequences  for  One  Measurement 


„  N,  N. 

hj t  -  {hjt(l),hjt(2),  . 


.  hd“(Nt>} 


(28) 


The  notation  can  be  understood  more  easily  in  the  following 
examples. 

In  Figure  2,  the  hypothesis  sequence  h1  is  that  the 

i 

event  observed  at  ti  was  due  to  signal  and  hypothesis 
sequence  h*  denotes  that  the  event  at  t!  was  due  to  noise. 
The  sequence  hj  is  composed  of  the  single  entry 
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h1  =  {h^DJ  =  {1} 

x  1 


(29) 


where  hi(l)  is  the  value  of  the  sequence  h1  at  time 

i 

tj  .  Values  are  assigned  as 


h.Nt(i)  =  1  :  event  due  to  signal 

J  v  / 


event  due  to  noise 


By  the  same  method,  the  sequence  h1  is  defined  as 


(30) 


h  o  =  { h  o ( 1 ) }  =  {0} 


(31) 


The  upward  branches  on  the  tree  denote  data  points  which  we 
associate  with  the  signal  process  and  the  downward  branches 
are  for  noise  caused  data  points. 

When  a  second  data  point  is  observed  at  t2  ,  there  are 
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Figure  3.  Hypothesis  Sequences  for  Two  Measurements 


four  possible  sequences  to  describe  the  origination  of  the 
two  data  points  over  the  interval  [tn,t),  t2<t  .  These  can 
be  shown  as  in  Figure  3. 

From  Figure  3,  it  can  be  seen  that  under  hypothesis 

2 

h2  ,  the  first  observed  event  (at  time  ti  )  is  assumed  to 

have  been  caused  by  the  signal  process  (hi(l)=l)  .  The 

second  observed  event  is  assumed  to  have  been  caused  by  the 

2 

noise  process  (  h.,(2)=0  ).  The  entire  sequence  can  be 


explicitl.  written  as 


(32) 


:c 

k  * 

»  '■  ■_ 

h  * 

►  ^  ’  ♦ 

I 

1 


h“  =  (h“(l),hj(2)}  =  (1,0) 


In  a  similar  manner,  the  other  three  possible 
hypotheses  are  defined,  for  N^=2  ,  as 

h]  =  (0,0} 


{0,1} 


(33) 


h j  =  {1,1} 

Figure  4  shows  the  tree  diagram  for  the  time  interval 
[t0,t),  t3<t  .  For  clarity,  only  the  values  of  one 
sequence  are  labeled. 

From  this  example,  it  can  be  seen  that  for  any  time 
interval  [tfl,t)  ,  there  are  exactly  2Nt  possible 
sequences.  These  sequences  describe  all  of  the  possible 
ways  in  which  a  signal  process  and  a  noise  process  could 
have  caused  the  observed  measurement  history.  Thus,  we  need 
only  consider  a  finite  (although  growing)  number  of  possible 
models  as  described  in  the  last  section.  The  close  analogy 
to  the  time  varying  parameter  case  of  reference  7  can  now  be 


I 
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figure  4.  Hypothesis  Sequences  for  Three  Measurements 


seen . 

In  order  to  use  MMAE  for  the  point  process  signal  plus 

Nt 

noise  problem,  we  associate  each  hypothesis  h.  with  a 

J 

distinct  model  of  the  observed  process  and  expand  our 
hypothesis  definition  of  equation  (15)  to 


h .  L  ell,  j  e  0,1,2,  ...  ,  2  *-1 
J 


(34) 


For  the  time  interval  [t0,t)  ,  the  measurement  history 
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4 


N. 


*=  { (t  i ,  r  i ) ,  ( 1 2 ,  ) , 


•(tNt'rNt» 


(35) 


Nt 

The  number  of  events,  ,  is  implicitly  included  in  Z 

We  can  apply  equation  (24)  to  obtain  the  expression  for  the 
optimal  estimate  of  £(*) 


(2Nt)-l 

x(t)  =  J  Pr[h  Nt|ZNt]^,(t)  (3C) 

j=0 


where 


±  A  N.  N, 

Xj(t)  -  E(x(t)|hJ  \z  Z} 


(37) 


Because  of  the  assumption  that  the  signal  and  noise  are 

independent,  we  can  ignore  all  observed  data  points  which, 

based  on  the  hypothesis  sequence  h.^  ,  are  caused  by 

noise.  Equation  (37)  is  the  estimate  of  x(t)  obtained  by 

only  considering  those  data  observations  for  which 

h1Nt(1)=1  for  i  e  1,2  3  ...N  •  These  individual  filters 
J  *  *  *  *  t 
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are  thus  "tuned"  to  the  respective  hypothesis  sequences. 

The  structure  of  the  overall  estimator  is  the  same  as 
shown  in  Figure  1,  however,  now  the  form  of  the  individual 
estimators  depends  on  the  particular  point  process  signal 
under  consideration.  For  the  conditionally  Poisson  point 
process  signal  model  described  in  Section  II. 2,  the 
individual  estimators  are  the  Snyder-Fishman  filters 
described  in  Section  II. 3. 

A  significant  difference  between  sequence  hypothesis 
MMAE  and  the  constant  parameter  linear  Gaussian  MMAE  case 
described  previously  is  that,  in  the  sequence  case,  the 
number  of  filters  doubles  with  each  observed  data  point. 
There  is  a  close  analogy  between  sequence  hypothesis  MMAE 
for  a  point  process  model  and  the  time  varying  parameter 
multiple  model  adaptive  estimator.  The  growing  number  of 
filters  places  a  serious  computational  burden  on  the 
estimator.  Methods  of  alleviating  this  burden  are  considered 
in  Chapter  V. 

II. 4. 3  MMAE  for  the  Particle  Beam  Problem.  Equation 
(36)  defines  the  overall  estimate  of  x(t)  and  Snyder  and 
Fishman's  results  (equations  10-13)  provide  the  means  for 

A 

evaluating  the  individual  model  estimates  x.(t)»  The  only 

3 

remaining  term  to  specify  analytically  is  Pr [h | ,  the 
weighting  factor. 

To  simplify  notation,  denote  a  single  space-time 
observation  as 


41 


z(i)  =  (ti,ri) 


X  'J 


(38) 


and  the  measurement  history  over  [t0,t]  as 


Z  1  =  {z(l),z(2),...,z(Nt)} 


(39) 


Note  that  the  sub-sigma  field  B  includes  all  th 

N 

information  in  Z 

In  terms  of  this  notation,  the  goal  is  to  evaluate 


N+  N, 

Prlh.  t|Z  t] 


“  Prthj  t(l),...,hJ  r(Nt)|z(l) _ ,z(Nt)]  (40) 


The  probabilities  of  equation  (40)  can  be  combined  into 
discrete  probability  density  function.  We  can  use  Bayes 
rule  to  obtain  (Ref.  33:176) 


2  su-h  "Vrth..  V't]  = 

3  3  p(ZlNt) 


(41) 


The  denominator  of  equation  (41)  is  a  probability  density 
function  evaluated  at  .  It  is  the  probability  density 
of  obtaining  the  specific  realization  ZNt  over  the  interval 
[to ,t )  ,  tN  <t<tN  +1  ,  from  the  space-time  conditionally 
Poisson  process  with  rate  parameter  X(t,r,x(t))  defined  by 
equation  (8).  This  density  is  termed  the  sample  function 
density  (sfd).  Snyder  (Ref.  4b)  develops  the  sample 
function  density  for  a  temporal  (no  spatial  dependence) 
Poisson  process.  The  brief  development  of  the  sample 
function  density  presented  here  for  a  space-time  Poisson 
process  follows  Snyder's  method. 

Because  the  process  in  question  is  Poisson  conditioned 
on  knowledge  of  x(t)  and  any  uncertainties  in  X_(»)i  we 
first  consider  the  (trivial)  case  of  x(t)  and  X  (•)  known 
exactly.  The  random  case  will  be  discussed  at  the  end  of 
this  section.  Two  additional  pieces  of  notation  will  be 
useful  for  developing  the  sample  function  density.  Let 


(42) 


( 


1 


Nt(y) 


=  N( [t  o , t )  X  Y) 


1 

I 


where  Y  9Rm.  Thus  N^CY)  is  the  number  of  space-time  point 
events  observed  on  [to,t)X  Y  .  Note,  that  if  Y  =  R  , 
then  N. (Y)=N.  as  defined  in  equation  (1).  For  t0-v<t  , 
let 


1 


Nvt(Y)  =  Kt(Y)  -  NV(Y) 


(43) 


! 


« 


i 


Since  we  have  assumed  a  space-time  Poisson  model,  Nv^(Y) 

is  Poisson  distributed  with  rate  parameter  A(t,r,x(t)). 

We  begin  by  writing  the  probability  that  the 
Nt 

realization  c,  of  the  process  occurs  within  some  small 

Nt 

space-time  volume  which  includes  Z 


PrU  e[Z  ,Z  X+AZ)] 

—  Pr[x  i ,  e[t  i ,  t  x+At  i  ),...,  t[tj^  ,tj^ 

t  t  t  "fc 


A.i£c(ri,pi),...,'tj»  ec(r„  > P«  )3 


(44) 


Nt  _  N 

where  £  ,  and  are  the  dummy  variables  for  Z  z 

and  ri  respectively,  ie{l,2,3,  ...  Nt>  ,  and  the 

observation  interval  is  [t0,t)XRrn  .  The  cubes  in  Rm  are 

defined  as  before  except  they  are  now  indexed  in  time  by  i: 


Ji<7i 


’pi> 


A  r 

=  [rii 


Pil+Pil>X* 


. X [r .  , r .  +p . 

im’  lm  Kim 


A 

)  = 


(45) 


Equation  (44)  can  be  equivalently  written  as 


N  N  N 

PrU  te  [Z  t,Z  t+AZ )  ]  = 


19 


Pr[N 


to.t/^  "  °*  Nt1,t1+At1(riECl>  =  1'Nt1,t1+AtI(r^Cl)=0' 


Nti+Atx  ,t2(Rm)  =  °*  Nt2,t2+At2<r2eC2>  =  1,Nt2,t2+At2^r2^C2^~0' 


(46) 


Nt,tNt  +  AtNt.(rNt^CNt)  0’NtNt+AtNt,t(RR1)~°^ 


45 


we  can 


Because  N  t(Y)  is  distributed  as  a  Poisson  process, 
use  the  independent  increment  property  to  factor  the  right 
hand  side  of  equation  (46)  as 


PrCNt.,t1(Rm)s50]Pr[Nt1,t,  +  At1(r‘ECl)’:L:|Pr[Nt1,t1  +  fit1<r^c')”0:l 


•PrCNt1  +  At1,t2tRm)=0]-"Pr[NtN  ,tN  +AtN  <VV”°]  <47> 


-Pr  [N 


t  +At  t^  =  0] 


Each  of  these  probability  terms  can  be  written  in  integral 
form  and  the  terms  collected  to  arrive  at 


Nt 

PrU  e[Z 


N,  N. 


■AZ)] 


Nt 

=  n 

i=l 


X(x ,k , x( t ) )d<xdi 


(48) 


•  exp 


X(x  ,/l,x(t)  )dtdi 


With  this  expression,  v/e  can  now  develop  the 
probability  density  function.  The  sample  function  density 
is  defined  as  the  limit  of  the  probability  (defined  by 
equation  (48))  divided  by  the  incremental  space-time  volume 
as  the  space-time  volume  goes  to  zero  (Ref.  46:58): 


where  the  arguments  of  A(t.,r,x(t))  have  been  dropped  for 

t  h 

simplicity,  and  Pij  is  the  j  element  of  p  as  in  (45) 

As  written,  the  limit  in  equation  (49)  is 

indeterminate;  however,  by  repeated  use  of  I’Hospital's 
rule  and  Leibnitz’s  rule,  the  limit  can  be  evaluated  as 


4 


■3—  ■  V.-  -- 


p(Z  °) 


n  X(ti,ri,x(ti)) 
i=l 


exp 


-//  a(t  ,4,x(t  )  )d*.dx 
A  T.m 


t„R 


(50) 


f°r  tNt-t<tNt+1 


Note  that  x(t)  is  assumed  known  in  equation  (50).  For 
subsequent  use,  we  note  that  equation  (50)  can  be  factored 
into  the  recursion 


Nt  -  - 

p(Z  )  =  X(tN  ,rN  ,x(tN  ) ) • exp 

t  t  t 


-  f  I: 

V1  R 


Xd/tdx 


N  -1 
P(Z  t  ) 


(51) 


for  S  t  <  tNt+1 


where  p(z 


N  -i 

u  )  is  evaluated  at  time 


V1 


We  now  turn  to  evaluation  of  the  numerator  in 


(41)  for  values  of  h-h 


By  the  use  of  Bayes 


can  write 


equation 
rule,  we 
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Nt  A. 


Nt  Nt 

p(h.  \Z  -)  =  p(hj  t(Nt),...,hj  t(l),z(Nt),...,z(l)) 


N. 


N, 


NJ 


pdh.  L(Nt),z(Nt)|hj  u(Nt-l),...,hj  (1)  ,z(Nt*-l) 


N, 


N. 


p(h.  t(l),z(Nt-l), . 


C  0 


9 


N  N.-l  N.-l  N.-l  N.-l 

“  p(h.  l(Nt),z(Nt)|hri:  ,Z  1  )p(hyZ  ,Z  1  ) 


At  this  point,  the  j'  notation  requires 

N 

‘  t 

explanation.  The  sequence  h^  is  defined  as 


,lv*" 


Nt 

:  if  h.  x(Nt)  =  0 


Nt  < 

hJ  =  ' 


I  Nt"1  Nt 

yjh  yz  ,lj  :  if  hj  x(Nt)  =  1 


(52) 

MD)- 

.  MD) 

(53) 

(54) 

some 


(55) 
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The  variable  j  is  an  index  for  keeping  track  of  the 
sequences.  As  each  new  data  point  is  observed,  the  number 
of  possible  sequences  doubles  and  the  numbering  (indexing) 
of  nearly  every  sequence  changes  as  shown  in  Figures  2,3 
and  4.  The  only  sequence  which  does  not  change  index  value 
is  the  "all  noise"  sequence.  Therefore,  the  value  of  j  is 
almost  never  equal  to  the  value  of  j1  in  equation  (54). 

This  complicated  notation  is  necessary  due  to  the 
expanding  number  of  sequences.  The  concept  can  be  stated 
clearly  in  words  as: 

N  -1  Nt_1 

h.^  is  the  sequence  (out  of  2  possible 

J 

sequences)  which  is  concatenated  with  the  sequence 

Nt  Nt 

value  h.  (N,  )  to  obtain  the  sequence  h. 

J  ^  J 

The  actual  values  of  the  indices  j  and  j '  are  not  important 
to  the  MMAE  problem;  it  is  only  necessary  that  the  correct 
sequence  be  identified.  Obviously,  the  value  of  the  indices 
is  important  to  the  implementation  of  the  estimator. 

We  can  now  combine  equations  (54)  and  (51)  to  show  that 


p(h^t>zNt) 

]  p(ZNt) 

N.-l  N  -1 

(N+),z(N. )|h  *  ,z  ) 

N.-l  N.-l 
P(h *1  >  Z  * 

X(tN  'rVX(tNt))eXP  -  J  K 

L  V-l  R 


Ad/tdi 


N  -i 

p(Z  X  )  (56) 


where  the  upper  limit  of  integration  on  the  time  integral 
has  been  changed  from  t  to  t^  to  reflect  the  fact  that  we 

are  interested  in  evaluating  Pr [h ,Nt | ZNt]  immediately 

J 


after  having  observed  the  point  event  at  t 


Nt  • 


From  Bayes’  rule  in  the  form  of  equation  (41)  it  can  be 
seen  that  equation  (56)  is  recursive  : 


Nt  N. 

Pr[hj  1|Z  l]  = 


N.-l  N.-l 

«  rj  V*  ' 


P(h.  w(Nt),z(Nt)  h j -  ,Z  u  ) 

tNt 

^(tN  ,rN  ,x(tN  ) )e.\p  -/  J  'a d”dx 

t  ‘  t  t  / 

-  ''XT 


(57) 


N.-l  N.-l 
•  Pr[hj  |Z  1  ] 

The  one  remaining  term  to  specify  for  completion  of 


this  MMAE  example  is  the  numerator 


N.  _  N.-l  N  -1 

pChj  t(Nt),z(Nt)|hj.  T  ,Z  ) 


(58) 


The  method  for  evaluating  this  mixed  (discrete  and 
continuous)  density  is  similar  to  that  used  to  develop  the 

sample  function  density.  There  are  two  possible  values  for 

N+  Nt 

h.  (N  )  ,  0  or  1.  Consider  h.  (N,  )  =  0  first: 

i  t  J 


Pr[hj  <V  =0,z(Nt)E[tNt,tNt+At)XcNt 


N  -1  N  - 

V  -Z 


1‘ 


=  Pr  no  signal  event  in  [tN  _1»tN  )  and  only 
L  "t  t 

one  noise  event  in  ^ 

N  -1  N.'-l  I 

at  ^tN  ,tN  +At)XcN  |h y  ,Z  J 
t  t  t 


(59) 


In  equation  (59),  the  conditioning  is  on  a  specific 
Nt-i 

value  of  h.^  and  an  observed  realization  of  the 


t  0' 

B 


« 


Nt-l 

measurement  history  Z  z  .  Since  these  are  both  values 
(or  realizations  as  opposed  to  functional  forms  dependent  on 
Ag  or  ),  and  since  the  signal  and  noise  processes  are 
independent,  we  can  factor  equation  (59)  as 


[N  -1  N  -l"| 

no  signal  event  in  [tN  _1,tN  )|h..t  ,Z  t  J 


•Pr  f< 


jonly  one  noise  event  in  [tVT  ,  ,t„  +At  ) 
|_  u  Nfl  Nt  ' 


at 


r  N.-l  N.-ll 

CtNt,tNt+At)XCNt'hj"  ,Z  J 


(60) 


(If  we  were  not  given  values,  the  joint  conditional 
probability  density  is  not,  in  general,  factorable.)  By  the 
independent  increment  property  of  the  signal  and  noise 
Poisson  processes,  the  conditioning  can  be  dropped  resulting 

in 


Pr[ •  ]  =  Prj^no  signal  event  in  [t 


•Pr  only  one  noise  event  in  [t„  „,t„ 

L  Nt-1’  Nt 


+At  ) 


at 


^tNt’tNt+At)XcNt 


(61) 
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« 


where  the  arguments  of  X  and  X  have  been  dropped  for 

s  n 

simplicity.  We  can  obtain  the  density  for  eauation  (58) 
Nt 

when  h.  ( N  ) = 0  (as  we  did  for  the  sample  function  density 

J  ** 

development)  by  a  limiting  method  : 
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By  repeated  use  of  1' Hospital’s  rule  and  Leibnitz's  rule, 
this  reduces  to 


Nt  _  N  -1  N.-l 

P(hj  (Nt}  =  °.z(Nt)lhr  t  ,Z  t  ) 


Xn(tNt>rNt>  exp 


'N. 


-  /  / 


Xd/idt 


^  tNt  -1  Rm 


(64) 


where 


X  =  X(t,r,x(t))  =  Xs(t,r,x(t))  +  Xn(t,r)  (65) 


By  the  same  argument,  it  can  be  shown  that 


< 


N.  _  N.-l  N.-l 

p(h.  t(Nt)  =  l,z(Nt)|hr  ,Z  r  ) 


"  tNt 

=  As(tNt*FNt>^(tNt))exp  -J  Jm 

l  4Nt-l ‘ 


Xd^dxl  <66) 


We  now  have  all  the  pieces  necessary  to  calculate  the 
weighting  factors  in  equation  (36).  By  substituting 
equation  (64)  or  (66)  as  appropriate  into  equation  (57),  it 
can  be  seen  that 


’  *n(W 

X(tNt’rNt,x(tNt)) 


N.-l  N.-l 
Pr  [h^ '  lz  ] 


Nt  N 

Pr[h.  t|Z  t]  = 


if  h.  J(Nt)  =  0 


X s(tIVrN  ’x(tN.  N  -1  N  -1 

- - V-1! - —  ’  Prlh.S  |  Z  1  ] 

X(tN  > )) 
wt  t  t 


(67) 


if  h.  t(Nt)  =  1 
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where  the  denominator  is  given  in  equation  (65). 

Thus,  equation  (36)  defines  our  multiple  model  adaptive 
estimator,  where  the  weighting  factors  are  defined  by 

A 

equation  (67)  and  x.(t)  is  the  Snyder  and  Fishman  filter 
estimate  of  x(t)  given  the  jth  hypothesis.  This 
development  is  for  the  case  of  a  known  x(t)  ,  as  assumed 
just  prior  to  equation  (42).  The 


Pr  [  •  ] 


s  or  n 

Ao  +  An 
s  n 


Pr[-] 


(68) 


form  provides  insight  into  the  nature  of  the  weighting 
process  and  a  similar  form  will  be  seen  in  the  estimator 
developed  in  Chapter  IV. 

It  is  trivial,  however,  to  estimate  x(t)  when  we  have 

assumed  that  it  is  a  known  deterministic  function.  We  could 

correct  this  by  allowing  x(t)  to  be  random  and  proceed  as 

before  to  solve  for  the  weighting  factors  in  equation  (41). 

The  denominator  of  equation  (41),  the  sample  function 

Nt  _ 

density  of  equation  (50),  is  actually  p(Z  |x(t)=£(t))  in 
this  case  and  we  can  write 


p(Z  t)  =  J  p(Z  |x(t)  =  C(t))p(C(t) 


(69) 


where  "£(t)  is  the  dummy  variable  for  x(t)  .  Because  x(t) 

is  defined  as  the  output  of  a  linear  Gaussian  system, 

p(x(t))  exists  and  the  integration  in  equation  (69)  can, 

in  principle,  be  performed  although  it  is  complicated  and  a 

closed  form  solution  might  not  be  possible. 

Similarly,  the  numerator  of  equation  (41)  must  be 

developed  given  x(t)  as  random  and  then  averaging  over  the 

statistics  of  the  x(t)  process. 

A  furcher  complication  arises  when  we  add  feedback  to 

achieve  some  sort  of  optimal  control  for  the  system.  One 

form  of  the  feedback  could  be  as  a  control  input 
Nt 

c^(t,Z  )  to  the  equation 


dx(t )  =  F(t )x( t )  dt+G( t )du(t )  +  c"(t,Z 


(70) 


where  c'(t,Z  ) 


is  an  n  dimensional  control  vector 


generated  in  some  optimal  manner  from  the  observed  process. 


will  affect  the  form  of  the 


Nt 

The  presence  of  c^(t,Z  ) 
density  p(x(t))  and  may  make  the  integrations  even  more 
complicated,  if  not  intractable. 

The  serious  computational  problems  which  arise  with 
random  x(t)  and  feedback  control  motivate  us  to  consider 
modeling  concepts  other  than  the  probability  density 
approach  taken  in  this  chapter. 

11.5  Summary 

In  this  chapter,  the  physical  model  of  the  observed 
process  is  presented  in  which  a  space-time  point  process 
signal  is  observed,  corrupted  by  space-time  point  process 
noise.  The  processes  are  assumed  to  be  conditionally 
Poisson  and  statistically  independent  of  each  other. 
General  MMAE  techniques  are  discussed  and  the  equations  for 
MMAE  on  this  point  process  problem  are  developed  for  non 
random  rate  parameters  and  An  .  The  development  is 
based  on  probability  density  functions.  The  form  of  this 
estimator  will  provide  insight  in  later  chapters.  The 
difficulties  associated  with  using  a  probability  density 
function  approach  for  random  rate  parameters  are  discussed 
and  motivation  for  the  measure  theory  approach  of  Chapter 
III  is  presented. 


In  Chapter  III,  the  observed  physical  process  described 
in  this  chapter  is  modeled  as  a  doubly  stochastic  space-time 
Poisson  point  process.  This  "cross  product  space"  model  and 
a  measure  theory  approach  to  the  statistics  will  allow  us  to 
overcome  both  the  complex  integration  and  feedback  problems 
encountered  in  this  chapter. 


Ill .  Cross  Product  Space  Model 


III.l  Introduction 

The  difficulties  encountered  in  Chapter  II  in  obtaining 
useful  evaluations  of  the  equations  for  multiple  model 
adaptive  estimation  stem  from  the  basic  approach  taken.  The 
uncertainties  were  modeled  in  terms  of  probability  density 
functions  of  random  processes.  This  results  in  expressions 
which  are  very  difficult  to  evaluate  (equation  (69)  for 
example)  except  for  trivial  cases  such  as  a  known,  non- 
random  x(t)  .  The  addition  of  feedback  control  further 
complicates  evaluation  of  the  estimator  equations. 

In  this  chapter,  a  fundamentally  different  modeling 
approach  is  taken,  but  one  which  coincides  well  with  the 
physical  problem  under  consideration  and  which  will  allow  us 
to  make  use  of  the  MMAE  development  in  Chapter  II.  The 
observed  process  is  modeled  as  a  doubly  stochastic  space- 
time  point  process  on  [to,t)XRm  .  The  statistics  are 
defined  using  measure  theory  concepts  on  a  cross  product  of 
two  probability  spaces  and  feedback  control  is  included  in 
the  basic  model.  Some  necessary  results  from  Fishman, 
Reference  il,  are  presented  in  Sections  III. 2  and  III. 3, 
including  a  definition  of  regularity  for  a  doubly  stochastic 


space-time  point  process.  The  implications  of  regularity 
are  discussed  in  Section  III. 4  This  section  presents  the 
tools  necessary  for  overcoming  the  difficulties  encountered 
in  Chapter  II  in  deriving  the  weighting  factors  for  the 
multiple  model  adaptive  estimator.  The  main  result  is 
Theorem  III. 6,  the  representation  theorem,  which  provides  a 
method  for  calculating  a  posteriori  probabilities  of  the 
form  needed  for  the  weighting  factors. 

In  Section  III. 4,  an  analytic  description  of  the  beam 
point  process  estimation  problem  is  given  in  terms  of  the 
cross  product  space  model  and  a  regularity  proof  is  given 
for  this  analytical  form. 

I I I. 2  Doubly  Stochastic  Space-Time  Point  Process 

In  general  terms,  a  doubly  stochastic  space-time  point 

process  is  a  space-time  point  process  in  which  some 

parameter  of  the  process  is  itself  random.  Our  physical 

model  for  the  beam  problem  fits  this  description:  we 

observe  a  space-time  point  process  on  [to,t)XRm  and  we 

wish  to  estimate  the  state  of  the  Markov  process  which 

determines  the  location  of  the  centroid  of  the  observed 

process.  We  have  additionally  assumed  that,  given  x(t)  and 

Xn ( * ) ,  the  process  is  Poisson.  This  assumption  is  not 

necessary,  however,  for  the  results  presented  in  this 

section.  As  defined  in  equation  (4),  x(t)  is  random  and  we 

wish  to  allow  other  terms  in  X  ( • )  to  be  random. 

s' 


A  random 


noise  rate  parameter  is  also  desired  and  can  be  used  to 
model  uncertainties  in  external  noise  sources  or  in  the 
detector.  Thus,  the  particle  beam  problem  can  be  described 
readily  in  terms  of  a  doubly  stochastic  point  process. 

We  begin  a  more  thorough  description  by  defining  a 
probability  space  (^clAolPa)  where  Q  is  a  nonempty  set, 

S  S  9  S 

is  a  Borel  field  (sigma  field)  of  subsets  of  ft  and  p  is 

s  s 

a  probability  measure  on  Ag  .  This  probability  space 

corresponds  to  events  we  cannot  observe  directly.  For  all 

u»seftg  let  there  be  a  probability  space  (ft ,  B  ,P(  •  ;ws)  )where 

the  probability  measure  P(*;ws)  is  dependent  on  the  event 

in  ns  ,  B  is  a  Borel  field  of  subsets  of  ft  and  p(*;w  ) 

s 

is  a  probability  measure  on  8  If  for  every  B cB  , 

P(B>*)  is  measurable  on  (H_,A„)  then  it  can  be  shown  (Ref. 

o  o 

32)  that  a  unique  joint  probability  measure  P'  exists  on 
gxB)  (where  AgxB  denotes  the  product  algebra 
of  Ag  and  8  (Ref.  32:71))  such  that 

P"(AXB)  =  J P(B  ;u>g)  Ps(dcos)  (71) 

A 

« 

A  e  5 

s 

B  e  B 
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and  a  unique  probability  measure  P  exists  on  (0,8) 


P(B)  =  P'(0sXB) 


/p(B;“s)Ps(d“s) 


B  e  B 


In  addition,  if  W(w  ; u) )  is  a  random 

s 

(0  xO,A  ©8)  then 
s  s 


j WdP"  =  J  J 


W(u)s;a))P(du;u)s) 


0gX0 


0  0 
s 


Ps<d“s> 


(72) 


variable  on 


(73) 


[t0  ,°°)  X  Rm 


Let  U  and  V  be  mappings  from  [to,00)  X  0  to 


tc  t> 


U:[t0,“)  X  !1  +  [to,00)  X  Rm 


V:  [t  o ,°°)  X  fl  +  [ t  o , 00 )  X  Rm 


(74) 


in  which  an  event  BeB  is  mapped  into  a  set  of  space  time 

points  in  [t0,°°)  X  Rm.  The  underlying  probability  space 

for  the  process  U  is  (fl,B,P(*;w  ))  .  The  underlying 

s 

probability  space  for  the  process  V  is  (fi,B,P)  .  Note 

that  U  and  V  have  identical  pre-image  and  image  spaces.  The 

distinction  between  the  two  processes  is  that  we  explicitly 

show  the  dependence  on  cos  for  process  U  . 

This  definition  of  a  doubly  stochastic  space-time  point 

process  as  a  mapping  from  a  cross  product  of  probability 

spaces  gives  us  a  convenient  framework  for  describing  the 

particle  beam  problem.  We  observe  point  events,  z(i)  , 

iel,2,...,N+  which  are  generated  by  a  conditionally  Poisson 

process.  The  rate  parameter,  X(*),  of  the  Poisson  process 

is  defined  by  equation  (8)  and  is  itself  a  function  of  the 

random  process  x(t)  and  >  (•)  .  If  we  let  the 

probability  space  (ft  A  P  )  specify  the  random  nature  of 

v  s 1  s 9  s 
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x(t)  and  (the  potentially  random)  X  (•)  f  then  given 

u)  ,  the  process  U  is  Poisson.  Furthermore,  the  doubly 
stochastic  space-time  point  process  V  is  defined  for  this 
application.  Figure  5  summarizes  this  modeling  concept. 

III. 3  Regularity 

Several  important  results  can  be  obtained  if  the  doubly 
stochastic  space-time  point  process  (modeled  as  a  mapping 
from  a  cross  product  of  probability  spaces)  is  regular.  We 
first  consider  the  definition  of  regularity  and  then  some 
useful  implications  of  regularity.  Some  preliminary 
definitions  and  notation  are  necessary  prior  to  defining 
regularity . 

Definition  III-l.  Let  t  be  an  element  of  [t0,T)  and 

let  N  be  defined  as  before.  Let  8,  denote  the 

t  L 

subsigma  field  of  8  generated  by  the  random  variables 

{Nt,(ti,r^,  ...  ,(tN  ,rN  )}  0 

t  t 

This  can  be  interpreted  as  the  sigma  field  generated  by  the 
sample  paths  of  the  point  process  up  to  time  t. 
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Definition  III-2.  A  space-time  point  process  is  called 
conditionally  orderly  if  for  each  point  (t , r )e [t o ,T) X  Y 


lim  _ 

V  Pr[N([t,t+At)Xc(r,p))-2|B.](io) 

At+O - - - 5 -  =0  (75) 

Pr [N( [t , t+At )Xc(r , p))-l|8.](w) 

p+0  x 

w.p.  1 

whenever  the  denominator  does  not  vanish.  E 

Conditional  orderliness  essentially  guarantees  that  the 
observed  events  are  distinct;  the  probability  of  two  or 
more  point  events  occurring  at  the  same  time  and  spatial 
location  is  zero. 

Definition  III-3.  Let  z(i)e[t0,t)  X  Y  ie 

1,2,3,  ...  ,Nt>  be  defined  as  in  equation  (38)  and  let 
z(i)  denote  a  random  variable  for  the  observation  at 
time  t.  and  spatial  location  r. 

l  i 


€ 


z(i)  = 


(76) 


The  j  order  distribution  function  is  defined  for  j 

-  1  by 


Fj(ZJ)  =  Fj(z(l)  ,z(2) ,  ...  ,z(  j)) 


=  Pr[z(l)^z(l),  ...  ,  z  ( j  ) — z ( j ) j 


z( i)e  [t  o , t  )  X  Y 


(77) 


i  —  1 » 2 ,  ...  , j 


where  the  vector  inequality  is  taken  as  an  inequality 
on  each  corresponding  element  of  the  two  vectors.  ■ 


We  can  now  define  the  conditions  a  space-time  point 
process  must  satisfy  in  order  to  be  regular. 


i 


Definition  III-4.  A  space-time  point  process  which 
maps  into  [t0,t)XY  is  regular  if  the  following  four 
conditions  are  satisfied: 

(a)  Each  distribution  function 


Fj(Zj)  ^  Fj  (z(l),  ...  ,z(j)) 


(78) 


j  =  1,2,  ...  , N 


is  absolutely  continuous  on 


R(m+1) j 


(79) 


pr[(tJ+i,rJ+i)e(tj,t)XY[8t  ]<! 


(80) 


w.p.  1 


for  all  finite  t  ,  j=0,l,2,  . . . 


(c)  The  point  process  is  conditionally  orderly.  (81) 
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Pr[N([t0,t)XY)<»]  =  1 


(82) 


for  all  finite  t  (Ref.  11:79)  83 

Condition  (b)  requires  that  the  conditional  probability  of  a 
new  point  occurring  anywhere  in  Y  ,  for  finite  t  ,  is  less 
than  one;  that  is,  there  are  no  guaranteed  points. 
Condition  (d)  requires  that  the  number  of  points  in  Y  is 
finite,  for  finite  t  . 

Our  processes  will  be  modeled  as  doubly  stochastic 
point  processes.  Regularity  for  this  case  is  defined  as 
follows. 

Definition  III-5.  A  regular  doubly  stochastic  space- 

time  point  process  V: [t0 ,T)Xfi^[t0 ,T)XY  is  a  doubly 

stochastic  point  process  such  that  U  (equation  74) 

is  a  regular  space-time  point  process  for  each  m  eft 

s  s 

(Ref.  11:105).'  B 

Definition  III-5  provides  the  conditions  under  which 
the  doubly  stochastic  space-time  point  process  V  is 
regular.  In  order  to  use  the  results  of  the  next  section, 


we  require  that  V  be  a  regular  space-time  point  process 

(versus  a  regular  doubly  stochastic  space-time  point 

process).  The  distinction  is  that  for  a  regular  doubly 

stochastic  space-time  point  process,  V  ,  we  must  specify 

oj  to  insure  the  regularity  of  V  .  If  V  is  a  regular 
s 

space-time  point  process,  we  do  not  need  to  specify  w  to 

s 

use  the  results  of  regularity  (even  though  V  is  dependent 

on  w  ).  The  necessary  conditions  for  this  are  provided  by 
s 

Theorem  III-l. 

The  4>(*)  functions  in  the  following  theorem  are 
termed  hazard  functions  and  they  specify  the  infinitesimal 
properties  of  regular  space-time  point  processes.  A  hazard 
function  is  the  conditional  instantaneous  rate  of  occurrence 
of  new  events  per  space-time  volume.  The  hazard  function, 
<(>(•)  ,  for  the  general  regular  space-time  point  process 

corresponds  exactly  to  the  rate  parameter,  X(*)  ,  for  a 
Poisson  space-time  point  process.  The  existence  and 
usefulness  of  hazard  functions  are  discussed  in  Section 
I I I. 4.1.  For  a  proof  of  Theorem  III.l,  see  reference  11 
pages  10G-116. 

Theorem  III-l.  Let  V  be  a  regular  doubly  stochastic 
space-time  point  process  such  that 


(a)  If  BeB  and  te[t0,T)  then  P(B  ;<jog  |  8t )  (w)  is 

measurable  with  respect  to  the  product  sigma  field 


As0B 


(83) 


(b)  For  each  point  (t,r)e[t0,T)  X  Y 


<j>(t,r)  »  E{4»(t#r;u;u  )} 

s 


f  „ 

j  ♦(t,r;w; 


go  )P'(do)Xdu)  )<°° 


(84) 


(c)  4)(t , r ; co ; cj  )  is  measurable  with  respect  to 

S 

(85) 


m 


[t„,t)  X  R  X  8  X  A 


Then  V  is  a  regular  space-time  point  process. 
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Ill, 4  Implications  of  Regularity 

When  a  space-time  point  process  satisfies  the 

conditions  for  regularity,  it  can  be  shown  that  the 

associated  hazard  function  for  the  process  exists  (Ref. 

11:83-89).  A  sample  function  density  can  be  written  in 

terms  of  the  hazard  function  in  several  illustrative  forms. 

Of  direct  importance  to  the  estimation  and  control  problem 

for  a  point  process  signal  in  point  process  noise  is  that 

the  hazard  function  for  the  doubly  stochastic  space-time 

point  process  V  is  dependent  on  <u  and  oj  .  The  u 

s 

dependence  allows  for  feedback  control  of  the  system  from 

the  observations  in  [t  ,T)XRm  .  Regularity  also  provides 

o 

a  means  of  calculating  P  ( A  |  B  ),  the  probability  of  an  event 

s  t 

AeA  given  the  sub-sigma  algebra  generated  by  the 

measurements,  g  If  we  model,  in  q  ,  the  uncertainty 

t  s 

of  whether  the  noise  process  or  signal  process  caused  an 
observed  point  event,  then  we  can  use  Ps(A|8t)  to  derive 
the  probability  that  a  particular  hypothesis  sequence 
(represented  by  A)  occurred,  given  8^  .  This  is  the  key 
result  necessary  to  develop  the  individual  filter  weights 
for  the  multiple  model  adaptive  estimator. 

As  a  result  of  regularity,  we  can  also  develop  a  direct 
estimator  (as  opposed  to  a  multiple  model  adaptive 
estimator)  for  the  process  x(t)  when  the  observations  are 
corrupted  by  point  process  noise.  The  direct  estimator  is 


developed  in  Appendix  A.  The  multiple  model  adaptive 
estimator  approach  results  in  much  simpler  individual 
expressions  to  evaluate  than  does  the  direct  estimator.  The 
tradeoff  is  the  growing  requirement  for  calculation  and 
memory  necessary  for  the  MMAE  approach. 

The  specific  implications  of  regularity  are  presented 
in  the  rest  of  this  section.  They  are  due  to  Fishman  (Ref. 
11)  and  are  presented  without  proof  as  background  for  the 
multiple  model  adaptive  estimation  filter  development  in 
Chapter  IV. 

I I I. 4,1  Hazard  Functions.  We  let  W  be  a  space-time 
point  process,  W:  [t  0  ,T) XQ->-[t  o ,  t  )  XRm  .  If  this  space- 

time  point  process  is  regular,  that  is,  it  satisfies 
definition  III-4,  then  the  hazard  function,  <j>(t ,  r  ;w) 

exists  (Ref.  11:83-89).  The  infinitesimal  properties  of  the 
regular  space-time  point  process  are  described  in  the 
following  theorem. 

Theorem  I I 1-2.  The  following  limit  holds  (w.p.l)  for 

almost  all  ( t , r ) e [t 0 ,T)XY 


lim 


At 4-0  (Atp)-1Pr[N(  [t ,  t  +  At  )  Xc(  r ,  p  )  )-l  |  Bt  ]  (oi ) 

p-0 


lim 

=  At  +  O  (Atp)~Pr[N(  [t,t+At)Xc(r,p'))  =  l|Bt](u>) 

p-*0 


(86) 


=  4>  ( t ,  r ;  to  ) 


Proof:  (Ref  11:89) 

m 

If  a  doubly  stochastic  space-time  point  process  V  is 
regular  (Definition  III-5),  then  each  process  U  is 
regular  for  a  given  UJS£^S  .  A  hazard  function  exists  for 
each  doubly  stochastic  regular  space-time  point  process  V 
and  is  of  the  form  (t,r;w;u>s)  j  weft  ,u)s£fig  .  The 
dependence  on  ws  is  due  to  the  requirement  that  U  is 
regular  given  . 

If  the  doubly  stochastic  regular  space-time  point 
process  V  satisfies  Theorem  III-l,  then  V  is  a  regular 


76 


space-time  point  process  (note,  the  "doubly  stochastic" 
qualifier  has  been  dropped).  Because  the  space— time  point 
process  V  is  regular,  a  hazard  function  exists  in  the 
form  4>(t,r;w)  .  The  evaluation  of  T(t,r;u>)  is  presented 
in  the  following  theorem. 

Theorem  1 1 1-3.  Let  V  be  a  doubly  stochastic  space- 
time  point  process  satisfying  Theorem  III-l.  Then 
(w.p.l)  the  following  equation  is  valid  for  almost  all 
(t,r)e[t0,T)  X  Y 


4>(t,r;w)  =  <j»(t,r;w)  =  E  (<j>  (t ,  r ; 


0) 


;  u>  ) 
s ' 


A0©8t} 


(87) 


where  A0eAs  is  the  set  } 

the  empty  set. 


and  p  is 


Proof:  (Ref.  11:117) 


The  conditioning  on  the  sub-sigma  field 


Ao®B 


is 


equivalent  to  conditioning  on  the  "measurement  history"  8^ 
and  no  further  information  about  Q 

s 

Sample  function  densities  for  regular  space-time  point 
processes  can  be  written  in  terms  of  hazard  functions,  as 
shown  in  the  following  theorem. 


Theorem  I I 1-4.  Let  V  be  a  regular  space-time  point 

process  which  maps  into  [to,T)  X  Y  .  Let  te[t0,T) 
and  denote  the  sample  function  density  on  [to,t)  X  "V 


as  Lt(u))  •  Then  (w.p.l) 


Lt(w)  =  exp 


t 

-  ffcr  H ; w)d^dT  I 

.  t0  Y 


(88) 


for  Nt  =  0 


r  t 

a.s.  _  f 

Lt(od)  =  II  ♦  (ti,r.L;w)exp|  -  / 


r  t 

-  J  y*<J>(T 

-to  Y 


u)d4dx 


(89) 


N  *  1 


Proof : 


(Ref  11:92) 


Equation  (89)  can  be  written  alternately  as 


r  t 

c  c 

"t  -| 

j  Lt(ai)  =  exp 

-  I  J  <J>  (t  ,  a  ;o)  )d“dx  + 

J  ln4>(ti,ri;co) 

J  J 

to  Y 

i=l  J 

(90) 


L 

Lt(w)  =  expl  -  J  j<$>(r  ,^;w)d^dT  +  j  JlmKt  ,^;w)N(rtxXcU) 
L  to  Y  t0Y 


(91) 


where  the  last  integral  in  equation  (91)  is  called  a 
stochastic  counting  integral  and  is  defined  by 


t 

//■ 


to  Y 


1  n ( t  ,4;u))N(dxXd^t)  -  ln<J)(t  r  .  ;co) 

L—i  X  1 


(92) 
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We  can  similarly  define  the  sample  function  density  of 
the  doubly  stochastic  regular  space-time  point  process  V 
as 


and  the  N^-O  case  is  exactly  analogous  to  equation  (88). 

Furthermore,  we  can  write  the  sample  function  density 
of  the  regular  space-time  point  process  V  in  terms  of 

A 

<K*)  (recall  the  definition  in  equation  (87)). 

Theorem  I I 1-5.  Let  the  doubly  stochastic  regular 
space-time  point  process  V  satisfy  Theorem  III-l  (V 
is  regular).  Then 


80 


J 

I 

i 


i 


N+ 

^  r  t 

_  A 

Lt(a3)  =  n  4>(t^,iv;u))exp 


i=l 


-j  jilt,*-; 


w)d/idx 


1  to  y 


(94) 


=  /  Lt(«;«s)Pg(du8) 

°s 

“  Es{Lt(w^s)} 

Pro°f*-  (Ref.  11:118-119) 

IS 

With  these  preliminaries,  we  can  now  write  the 

representation  theorem.  This  gives  us  the  means  of 

evaluating  the  probability  of  an  event  AeA  (which  is  not 

s 

directly  observable)  given  8^. 

Theorem  III-6,  Representation  Theorem.  Let  a  doubly 
stochastic  space-time  point  process,  V  satisfy  the 
conditions  of  Theorem  III-l  on  [t0,T)XY  .  Then 
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Ps(fl|Bt)  - 


/ 


Lt<“^s)Ps(da)s) 


L 


(95) 


C 


Es{V^ws>K£A}Ps(A) 

Esat(u,;ws)} 


(96) 


i 


AeA, 


< 


h 


Proof:  (Ref.  11:127) 


E 


As  mentioned  previously,  it  is  the  representation 

Nt 

theorem  which  will  allow  us  to  evaluate  the  Pr[h.  is 

J 

correct  J  8 . ]  terms  in  the  multiple  model  adaptive 
estimator  posed  in  Chapter  II. 
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I 


I 


J 

! 

! 


! 


With  the  tools  of  this  section,  the  next  steps  are  to: 

(a)  Develop  an  analytical  description  of  the  particle 
beam  problem  which  fits  the  cross  product  space 
modeling  concept. 

(b)  Prove  that  the  process  U  specified  by  the 

analytical  model  is  regular  (Definition  III-4). 

(c)  Prove  that  the  space-time  point  process  V 

specified  by  the  analytical  model  is  regular  (Theorem 
III-l). 

This  is  accomplished  in  the  next  section. 

III. 5  Analytical  Cross  Product  Space  Model 

The  conceptual  cross  product  space  model  has  been 

presented  already  in  Figure  5.  In  this  section,  we  develop 

an  analytical  description  for  the  particle  beam  problem 

which  fits  the  cross  product  space  concept.  The  analytical 

description  is  necessary  for  evaluating  the  multiple  model 

adaptive  estimator  in  Chapter  IV. 

I I I. 5.1  The  Observed  Space,  ft  Let  U  be  a  random 

point  process  [t o  ,T)Xft-*[t o  ,T)XRm  .  We  assume  that  the 

process  is  Poisson,  conditioned  on  to  eft  where  ft  is 

s  s  s 

described  in  Section  III. 5.2.  We  let  the  rate  parameter  for 
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I 


I 


>1 


i  m 


the  conditional  Poisson  process  be  A(t,r;oj;oj  )  where  ooeft  , 

s 

as  eft  .  Note  that  X(*)  is  random;  however,  the  under 

s  s 

tilde  notation  has  been  dropped  because  the  dependence  on 
o)  and  o)g  is  shown  explicitly. 

The  basic  assumption  made  here  is  that  the  process  is 
Poisson,  conditioned  on  o)s  .  This  assumption  is  made 
because  a  conditionally  Poisson  process  models  the  photo¬ 
electron  events  adequately  (Ref.  13:49-55).  As  will  be 
shown  later  in  this  section,  if  we  restrict  X(*)  so  that 


t 

J  j X(x ,A;u; 


;cjs)d^dT  <  <» 


(97) 


to  Y 


for  all  t  < 


then  U  is  a  doubly  stochastic  regular  space-time  point 
process,  the  hazard  function 


<Kt,r;w;u>s)  =  X(  t ,  r  ;u  ;cog)  (98) 

exists,  and  the  dependence  of  X(*)  on  both  go  and  go  is 

s 

justified. 
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Further,  let  it  be  a 


I II. 5. 2  The  Unobserved  Space  . 

a  probability  space  in  which  ui  eft 

s  s 

cross  product  of  three  distinct  probability  spaces 


) 


) 


0« 


(ft 


s 


>Ps  > 


(which  correspond  to  the  randomness  in  the  signal 
and  hypothesis  sequences,  respectively)  such  that 
empty  set  fts  is  defined  as 


n_  -  ft. 


x  n„  x  n 


(99) 


noise , 
the  non- 


(100) 


and  the  sigma  field  A  is  defined  as 

S 


(101) 


T 


As  -  As,  9  As2  9  As, 


The  sigma  field  A  is  composed  of  subsets  of  ft  ,  and  P 


si 

is  a  probability  measure  on  A 


Sl 


Sl 

The  terms  A 


Sl 


S  2  *  S3  » 


P  P 

s2  ,  and  s3  are  similarly  defined  for  the  corresponding 

probability  spaces.  Thus,  w  is  specified  completely  by 

s 

u>„  eft-  »  “o  »  and  “Q  eno 

Sl  Sl  S2  S2  S3  S3 

Let  the  rate  parameter  for  the  observed  conditionally 
Poisson  process  be 


A(t,r;oj;ws)  =  a  ^  ( t ,  r  ;ug  #  )  Ag(  t ,  r  ;  u ;  u>g  ^  ) 


(102) 


+ao(t'r;ws3)Xn(t-r:w;ws2) 


The  signal  rate  parameter,  ir;u;wSl )  ,  is  defined  as 
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I 


c 


As(t,r;w;a)si) 


I- 


_  1 


=  A(t)exp  <  -i[r-H(t)x(t  ;o)g^  )+c(t ;  co )  ]  R”  (t) 


• [r-H(t)x(t ;u>  )+c(t; 

—  to  1 


;»)]| 


(103) 


X  <0 


where  x(t;u)  )  is  an  n  dimensional  random  variable  with 

Si 


domain  in  the  probability  space  (ft  ,A0  ,P  ) 

Si  Si  Si 


,  c(t;w) 

is  a  (possible)  feedback  control,  and  all  other  terms  in 

equation  (103)  are  the  same  as  in  equation  (3).  The  process 

x(t;w  )  ,  is  defined  as  the  solution  to 

s  l 


dx(t;ugi)  =  F(t)x(t  ;u)gi  )dt+G(t)du(t  ;o)gi  )+c^(r;oi) 


x(t0;uo  )  =  x0 

to  i 


(104) 


to  -  t 
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where  c^(t;w)  is  a  (possible)  control  input. 


k  o 


The 

difference 

between  equations  (3) 

and 

(4) 

and 

equations 

(103)  and 

(104)  is  that  in  the 

latter 

we 

now 

specify 

the  randomness  in  A  (t,r;uj;w  ) 

o  o  j 

(other 

than 

a 

possible 

control 

feedback)  via  the  probability 

space 

(f!s1-As1-PS,)  • 

There  are  two  significant  points  to  be  noted  for  future 
expansion  of  the  model.  First,  by  using  this  cross  product 
space  model,  we  could  let  other  components  of 

For  example,  we 
could  model  an  uncertain  rate  parameter  "amplitude"  as 


A  (t,r;w;oj  )  be  dependent  on  to 
s  s  i  s  i 


A(t;<o  )  in  equation  (103)  where  the  probability  space 
s  1 

(^s j  »^si  ,P«5]  )  is  exPan£ied  appropriately  to  specify  both 


A(t  ;toSi  ) 


A(t,r;co;toSi) 


and  x(t;co  )  .  Second,  because 

s  i 

is  allowed  to  be  a  function  of  to  ,  we  can 
include  feedback  control  in  this  model.  One  method  of  doing 
this  could  be  to  include  an  additive  control  of  the  form 
c(t;<o)  as  is  shown  in  equation  (103).  In  a  beam  tracking 
application,  in  which  we  are  limited  to  a  finite  sized  array 
of  photodetectors,  the  control  c(t;eo)  might  be  used  to 
adjust  the  pointing  of  the  detector  to  keep  the  current 
estimate  of  the  projected  position  (estimate  of  the  terms 
H(t)x(t;w  )  )  in  the  center  of  the  array.  If  this  were 
not  done,  the  "signal"  (maximum  intensity  point  of  the 
Gaussian  shaped  intensity  profile)  could  conceivably  "walk" 
off  the  detector  array  and  no  further  useful  information 
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r  ■ 

k 

f 


} 


S 

- 

1 


« 

s 

S 


would  be  observed  from  the  signal  process. 

An  alternate  (or  additional)  method  of  control  is 

provided  by  the  term  c"" ( t ; o> >  in  equation  (104).  In  a 

beam  pointing  application,  we  can  model  the  physical 

influence  on  the  beam  position  as  a  control  input  c'(t;w) 

to  the  process  dx(t;w  )  where  x  is  a  relative  rather 

si 

than  an  absolute  position. 

The  noise  rate  parameter  is  defined  as  a  scalar  random 
variable 

Xn(t,r;m;ws)  =  XQ(t ,  r  ;u>  ;u)S2  )  (105) 

with  domain  specified  by  the  probability  space 

(fl0  ,A^  ,P  )  .  Note  that  dependence  on  (feedback 

S2  S2  S2 

control)  is  allowed  in  this  model.  As  noted  before,  the 
regularity  proof  requires  that: 


to  Y 


W  )d/E.dT  <  ® 
S  2 


(106) 


E{Xn(t,r,w;u>s^)}«» 


t  <°° 

YCR® 
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The  probability 

space 

(fl  ,  A  _ 
v  s3*  s 3 

,P  )  and 

s  3 

the 

coefficients  ao(t,r;w  ) 

s  3 

and 

«i(t,r;wSs 

)  allow  us 

to 

describe  analytically  the  conditioning  in  expressions  of  the 
form  (for  example) 


Eg{ ( • ) j  A)Pg( A)  = 

A 


/ 


(.)Ps(du»s) 


(107) 


where  (•)  is  an  event  of  interest  and  AeA  is  the  event 

Nt  Nt 

associated  with  one  of  the  2  possible  sequences  h . 

O 

Evaluation  of  equations  of  the  form  shown  above  *  are 
necessary  in  Chapter  IV  to  derive  the  multiple  model 
adaptive  estimator  equations. 

Let  («s  .*s  .Ps  )  be  a  discrete  probability  space 

where  each  u>  eQ  is  associated  with  one  of  the 

S3  S3  N 

hypothesis  sequences.  Note  that  at  time  t,  there  are  2 

Nt  Nt 

possible  sequences,  h.  ,  jel,2,  ...  2  ,  and  that  the 

J 

number  of  possible  events  in  doubles  as  each 

s  3 

new  measurement  is  observed. 
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a  coefficients  are  defined  as 


a0(t,r;cos 


«i(t,r;a) 


a0(t,r;u) 


) 


/ 

0  At  measurement  times  and 
locations  when  it  is  given  (or 
assumed)  that  the  event  at  time 
t  was  due  to  signal. 

I  (108) 


y  1  Otherwise. 


)=  /  0  At  measurement  times  and 
3  locations  when  it  is  given  (or 

assumed)  that  the  event  at  time 
t  was  due  to  noise. 

< 

(109) 

'  1  Otherwise. 


t0  -  t 


)  =  ax(t,r;u>  )  =  1 

3  Sj 


(HO) 


If  there  is  no  assumption  (or 
given  information)  concerning 
which  process  (noise  or  signal) 
caused  the  observed  events. 


An  example  will  help  clarify  the  notation.  For 
observation  on  [t0,t)XY  ,  let  =  3  ,  and  Z3  = 
{ (t i ,r i ) , (t2 , r2 ) , (t 3 , r J } , 1 0<t  i <t2<t 3 <t  .  We  desire  to 
evaluate  an  equation  of  the  same  form  as  equation  (107). 
The  assumed  hypothesis  sequence  associated  with  the  event  A 
is,  for  example, 


h  3  »  {h  3 ( 1) , h  3 (2 ) , h  3 (3) } 
6  6  6  6 


=  {1,1,0} 


(HD 


=  (signal  at  tj,  signal  at  t2,  noise  at  t3} 


Recall  that  the  superscript  on  the  hypothesis  sequence  is 

the  number  of  space-time  events  that  have  been  observed  and 

the  subscript  is  the  index  of  sequence  under  consideration, 

Nt 

out  of  a  possible  2  possible  sequences.  The  associated 
space-time  history  of  the  a  coefficients  is  as  shown  in 
Figure  6.  Because  a0(t,r;u)  )  =  0  at  (ti,ri)  and 

(t2,r2)  t  the  observed  events  at  those  times  could  only  have 


Figure  6.  The  a  Coefficients 


<3‘ 


4 


been  caused  by  the  signal  process.  Similarly,  the  event 

observed  at  (t3,r3)  could  only  have  been  caused  by  the 

noise  process.  The  a  coefficients  are  specified  as  having 

a  value  of  "1"  between  measurement  times  because,  even 

though  no  point  event  occurred  in  these  intervals  for  the 

observed  realization,  a  point  event  could  have  occurred. 

Nt 

The  conditioning  on  h  specifies  only  what  can  or  can  not 

occur  at  the  observed  event  times. 

We  now  have  an  analytical  method  for  modeling 

conditioning  on  some  particular  hypothesis  sequence.  In 

order  to  use  Fishman's  results,  we  must  first  prove  that  the 

space-time  point  process  specified  on  the  probability  space 

(fi ,  8  ,P(  •  ;o)s)  t  and  described  analytically  as  above,  is 

regular,  conditioned  on  w_,. 

s 

Theorem  1 1 1-7.  Let  (ft ,  B  ,P(  •  ;u>„) )  be  a  probability 
. 1  & 

space  and  for  each  m  let  U  be  a  random  point 

process  U  :  [t  0  ,T)  Xft-»[t  o  ,T)XY  ,  which  is  Poisson 

conditioned  on  knowledge  of  w  eft  .  Let  the  rate 
parameter  of  the  conditionally  Poisson  process  be 
A(t,r;w;ws)  as  in  equation  (102)  and  let 

(flg,A  ,P  )  be  a  probability  space  which  is  defined 

as  a  cross  product  of  three  probability  spaces  as 
described  by  equations  (99),  (100),  and  (101).  If 
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I 


t 


(t,r;w;w 


S3 


)cltdT 


to  Y 


<  CO 


to  -  t  <  00 
Y  C  Rm 


(112) 


then  the  space-time  point  process  U  is  regular  given 
U)s£fis  . 

£2 


Proof.  To  prove  Theorem  I I 1-7,  we  must  show  that  the 

four  condicions  of  Definition  III-4  are  satisfied  for  the 

analytical  description  given  of  (^s»^s»ps)  •  Note  that 

this  theorem  states  that  the  point  process  U  is  regular 
given  to  eft  therefore  in  this  proof,  to  ,  to  ,  and  to 

“  “  S1  S2  S  3 

are  all  known  events  in  the  analytical  model  of 

X(t  ,r;to;us)  . 


on 


(a)  To  show  that: 
R(m+l)j  . 


Fj(Zj) 


is  absolutely  continuous 


By  equation  (77),  the  distribution  function  is  defined  as 


=  Pr[z(l)  ±  z(l),  ...  ,z(j)-z(j)]  (113) 


V/e  can  write  this  in  terms  of  a  3  order  joint  density 
function.  The  representation  is  formal  in  that  we  do  not 
know  if  the  density  exists. 


z(j)  z(l) 


ij(ZJ)  =  /./  f(z(l),...,z(j))dz(l)...dz(j) 


—  00  —00 


where  the  integrals  are  interpreted  as 


z(i) 


t<  *i, 
i  i  m 


///  •d^  . .  .d-tiidxi 


to-00  -00 


(114) 


(115) 


if:  Y  =  Rm 


i  1 > 2  t  •  •  •  >  j 


I 


and  the  integral  signs  and  differentials  are  nested  from  the 
innermost  pair  outward.  If  YCRm,  the  elemental  integrals 
in  equation  (115)  are  over  the  regions 


(^eY)^^  -zk(i)) 
k 


(116) 


k  -  1,2, ... ,m 


The  density  in  equation  (114)  is  the  probability  of 
{z (1) , . . . , z ( j )}  falling  within  some  infinitesimal  volume 
about  .  This  is  precisely  the  sample  function  density 
given  in  equation  (50)  and  repeated  here: 


p(ZJ)  =  [[  X(tjL>ri,a);u)s)exp  ~y*  J'xCx  ;u  ;ug)dA.dx  (117) 


t0  Y 


Since  is  given  in  the  conditions  for  this  theorem,  the 

process  U  is  conditionally  Poisson  and  this  expression  for 
f(ZJ)  is  valid.  The  sample  function  density  exists.  By  the 
Radon-Nikodym  Theorem  (Ref.  54:34),  if  the  density  f(ZJ) 
exists,  then  F.(Z^)  is  absolutely  continuous  with  respect 

•J 

to  and  tjie  condition  is  satisfied. 


(b)  To  show  that:  Pr[(ti+1,ri+1)e(t.,t )XY|Bt  ]  <  1 


w.p. 1,  i  =  0,1,... 


PrC(ti+1,ri+1)£(ti,t)XY|Bt  ] 


l-Pr[(ti+1,ri+1)^(ti,t)XY|Bt_] 


=  l-Pr[N(Ctj.,t)XY)=  o|B  ] 


=  l-Pr[N([ti,t)XY)  =  0] 


(118) 


where  the  last  step  is  due  to  the  independent  increment 
property  of  the  Poisson  distributed  process.  Also,  from  the 
Poisson  statistics 
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I 


Pr[N( [ti , t )XY)  =  0]  =  exp 


-ffH  T  ,^.;w;a)s)d^dT 


.  t.  Y 


9 


~  exp 


-ffi 


ai  (t  ,a;ojS3  )Xs(t  ,A.;o);aJSi  )d/idx 


ti  Y 


(119) 


'// 


{  f> 


ao(T,A;<d  )X  )d>.dx 

S  3  n  S  2 


In  order  to  prove  this  condition,  it  is  necessary  to  show 
Pr[N(  [t^t  )XY)  =  0]>0.  Equation  (119)  can  only  equal  zero  if 
the  exponent  is  infinite.  Because  the  rate  parameter  for 
the  signal  process,  X  (t,F;ui;w  )  ,  is  Gaussian  shaped 

S  Si 


y*a  i(t,*; 


;w_  )X  (t,*;w;u  )d*>  /  X_(t, )d* 


s3 '  s 


si 


*/ 


s  l 


(120) 


m 

=  A(t)(2TT)2|R(t)!4  <  » 


Note  that  the  Gaussian  shaped  intensity  profile  for  the 
signal  rate  parameter  yields  a  convenient  evaluation  of 
equation  (120)  and  allows  us  to  use  the  Snyder-Fishman 
filter  results  for  the  elemental  estimators  of  the  MMAE. 
The  Gaussian  shape  itself,  however,  is  not  necessary,  as 
long  as  the  conditions  of  this  theorem  are  satisfied  and  we 
use  an  appropriate  elemental  estimator.  We  now  need  the 
previously  mentioned  condition 


t 


// 


<xo(t,*;w  )X  (t,A;u;w  )d4.dx«» 
s  3  n  s  2 


t 


0 


Y 


(121) 


100 


each  have  a  maximum 


Since  a0(t,r;u)  )  and  cti(t,r;oj  ) 

S3  03 

value  of  one,  the  entire  exponent  is  finite  for  finite  t. 
Therefore 


Pr[N([ti,t)XY  =  0]  =  exp 


~  f fK(<T  ,fL^’ 


;ojs)ci'idT 


H  Y 


>  0 


(122) 


Substitution  of  equation  (122)  into  equation  (118)  results 
in 


pr[(ti+1,r.+1)e(ti,t)XY|8t  ]  <  1 


(123) 


w.p.  1 

(c)  To  show  that:  The  point  process  is  conditionally 

orderly : 

To  simplify  notation,  let 


(124) 


N( At , p  )  =  N([t,t+At)Xc(r,p>) 


With  this  notation,  we  can  write  the  expression  to 
evaluated  (from  equation  (75))  as 


lim 

At^O 

p+0 


Pr[N(At,p)-2jBt] 
Pr[N(  At ,  p")-l  |  8t  ] 


lim 

l-Pr[N(At,p)=0| B. ]-Pr[N(At,p)=l|B+] 
=  At-*0  - L__ - L_ 

1-Pr [N( At , P )  =  0 1 B,  ] 

p->-0  t 


be 


(125) 


lim 


l-Pr[N( At , p )  =  0]-Pr[N( At , p )  =  l] 

=  At->-0 - - - 

l-Pr[N( At , p )  =  0] 

p+0 


(126) 


where  the  last  step  is  due  to  the  independent  increment 
property  of  the  Poisson  point  process.  Since  the  point 
process  is  Poisson, 

lim 

At^O  Pr[N(At,"p)  -  1] 

p->0 


lim 

t+At 

At-*0 

1  I  A(t  ,X;w;o)s)d/!.dT 

(127) 

p~>0 

t  c(r,p) 

and 


lim 

At-0  Pr  [N(  At  ,"p  )  =  0] 

P-0 


lim 

=  At->0  exp 

p->0 


t+At 

-/  / 


X(t  )d-tdx 

s 


t  c(r,p) 


1 


(128) 


f)' 

therefore,  when  we  attempt  to  take  the  limit  of  equation 
(126),  the  result  is  indeterminate 


1-1-0  _  0  (129) 

1-1  0 


In  order  to  evaluate  equation  (126),  we  must  use 
L'Hospitals  rule  and  Leibnitz's  rule,  with  respect  to  Pi 
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(the  xirst  component  of  p  )  first*. 


lim  _ 

Pr [N( At , P )-2 | B  ] 

At-*0  - - - 

Pr [N( At , P )-l | B+ ] 

p-0  ** 


-  4  Pr[N(At,p)=0]-  JL. 


lim  1 im 


Pr[N(At 


lim  lim 


=  At-*0  Pi-^O 

p2->0 


6e~% 


pm-*0 

m 


to 


where 


t+At 


*/  /  A(t  ,/t.  ;oj  ;o)s)d4dT 

t  c(r,p) 


(132) 


(133) 


and 


t+At  r2  +  P2  VPp, 


=  f  f  f  X(t  ,r  i  +  p  \  ,Kz  , 

r 


t  Tl 


t 

t 


h  o 


s 


•  ♦  •  » 4.^  >  w  >  Wg )  d^i^.  •  •  d-t  2  dx 


From  equation  (133),  it  can  oe  seen  that 


lim 


Pi->0 


0  =  0 


(134) 


(135) 


and  the  limit  of  equation  (132)  can  be  taken  with  respect  to 
Pj  and  we  have  the  desired  result. 
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lim 


Pr  [N(  At ,  p  )-2  |  8  ] 

At+O  - - - —  =  0 

Pr [N( At , p )-l ( B,  ] 

p->0  1 


The  process  is  conditionally  orderly. 


(d)  To  show  that:  Pr[N(  [t  o ,  t  )XY)<°°]  =  1  for 


Pr[N([t0,t)XY)«»]  =  l-Pr[N([t0,t)XY)  =  «>] 


(136) 


finite  t 


(137) 


For  the  analytical  model  description  under  consideration 


Pr[N([t0,t)XY)  =  n] 


Pr[N([to,t)XY)<«]  =1-0=1 


(140) 


The  proof  is  complete. 


D 


We  have  shown  that  U  is  regular  given  u)  Eft  .  The 
following  theorem  shows  that,  for  the  analytical  cross 
product  space  model  given  above,  the  space-time  point 
process  V  is  regular.  With  this  condition  satisfied,  we 
know  that  a  hazard  function  exists  for  V  and  it  can  be 
evaluated  as  in  equation  (87). 


Theorem  III-8.  Let  the  probability  spaces 
and  (ft ,  B  ,P(  •  ;u)g) )  satisfy  Theorem  III-6. 


<ns>As-Ps> 


If 


1.' 


r« 


x  o* 

» 


(a) 


E{An(t,r;w;cjs2  )}’ 


!Xn(t’r>=  / 

nxa 


An(t,r;aj;ws2  )P>(dcoXdoJS2  )<‘ 


(141) 


and 

(b)  A(t,r;o>;w  )  is  measurable  with  respect  to 
s 

[t„,t)  X  Rm  X  fl  X  fig 

then  the  space-time  point  process  V  is  regular. 

m 

Proof.  To  prove  Theorem  III-8,  we  must  show  that  the 
analytical  description  of  the  cross  product  spaces  satisfies 
Theorem  III-l  (equations  (83),  (84),  and  (85)). 

(a)  To  show  that:  If  Be 5  and  te[to,T)  then 

P(B;u>  1 8.  )(w)  is  measurable  with  respect  to  the  product 

s  t 

sigma  field  Ag©B 

Consider  the  function  (mapping): 


i 
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I 


where 


P:flX£i  +[0,l]eR1 
s 


P(B;u)s|  St)(aO 


is  a  set  function  which  maps  from  the  cross  product  space 
into  a  closed  interval  on  the  real  line.  The 
function  P  is  measurable  with  respect  to  Ag®B  if  the 
set 


{BXus:P(B;u>s|Bt)(uO>a} 


is  measurable  for  all  cte[0,l]  (Ref.  36:65).  Since  it 

is  given  that  BsB  and,  by  definition  of  the  probability 
space,  w  eQ  (and  therefore  uk  eA_),  BXw_  is  measurable 

b  b  S  5  S 

with  respect  to  A  ®8  (Ref.  32:71).  Thus 

s 

P(B;ws| St)(u)  is  measurable  with  respect  to  Ag®8 
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(b)  To  show  that:  For  each  point  ( t , r )e [t 0 , T)XY 


//  /Xs(t»riu:u 


(144) 


+a  o  (t ;  u)g  3  )  y*Xn(t,r;U;a»S2)PS4(duS2)  PS3  ( <^s 3  )p( dm  ) 


If  X  (t,r;u>;w  )  is  Gaussian  shaped  as  previously 

S  S  i 

described,  then 


J  Xs(t,r;u);oJsi)pSi(du)Si)  =  Xs(t,r;m)  <  °°  (145) 


It  is  given  that  X  (t,r)  <  °°  .  Since  a0(t;u>  )  and 

n  S3 

ai(t;w  )  have  a  maximum  value  of  one,  we  can  bound  <p(t,r) 
s  3 
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(146) 


<j>(t 


.r>  -f> 


As(t,r;o))P(do))  +  Xn(t,r) 


The  condition  is  satisfied. 

(c)  To  show  that:  4>(t  ,r  ;oj;ojg)  is  measurable  with 
respect  to  [to,t)  X  Rm  X  B  X  A 

o 

This  condition  is  given  in  the  statement  of  the 
theorem. 

The  proof  is  complete.  P 

Condition  (c)  for  this  proof  was  given  in  the  statement 

of  the  theorem.  This  was  done  to  keep  the  form  of 

A(t,r ; oj ; oas )  as  general  as  possible  to  admit  a  large  class 

of  functional  forms.  In  order  for  this  theorem  to  be  of 

practical  value,  hov  or,  some  discussion  on  this  point  is 

in  order.  A  rate  ^rameter  function  A(t,r;w;u)  )  is 

measurable  on  [to,t.  X  Rm  X  8  X  A  if  it  is  measurable 

s 

individually  with  respect  to  each  of  its  arguments  when  the 
other  three  arguments  take  on  any  allowable  value. 


Therefore,  we  can  consider  the  arguments  individually 
providing  the  value  of  one  argument  does  not  affect  the 
measurability  of  the  function  with  respect  to  another 
argument.  From  measure  theory  (Ref.  18:284-287,  for 
example)  we  know  that  continuous  functions,  the  sum  of  two 
measurable  functions,  the  product  of  two  measurable 


functions , 

and  limits 

of 

measurable  functions  are 

all 

measurable 

functions . 

By 

using 

these  results,  we 

can 

construct 

a  large  class 

of 

useful 

measurable  X(t,r;oo 

!“s> 

functions  from  elementary  functions. 

I I I. 6  Summary 

In  this  chapter,  the  basic  physical  model  described  in 
Chapter  II  is  recast  in  terms  of  a  doubly  stochastic  space- 
time  point  process  defined  on  a  cross  product  of  two 
probability  spaces.  Several  analytical  tools  are  presented 
which  allow  inference  of  useful  information  from  the 
observed  doubly  stochastic  space-time  point  process.  In 
particular,  if  the  observed  process  is  regular  (Definition 
III-5),  then  its  hazard  function  exists  and  the  existence  of 
the  sample  function  density  is  guaranteed.  Also,  a 
representation  theorem  is  presented  which  gives  us  the  means 
of  calculating  the  a  posteriori  probability  of  an  event  in 
the  unobserved  (ft  ,A  ,P  )  probability  space  given 

observation  of  events  in  (ft,8,P)  .  Finally,  an  analytical 


description  of  the  particle  beam  estimation  problem  is 
given  in  terms  of  a  cross  product  model.  The  resulting 
process  V  is  shown  to  be  regular. 

Chapter  II  provided  the  basic  form  of  the  multiple 
model  adaptive  estimator  suitable  for  this  point  process 
problem;  however,  the  filter  weights  were  difficult  to 
calculate  from  a  probability  density  function  approach.  The 
methods  and  results  of  Chapter  III  provide  a  means  of 
calculating  the  individual  filter  weights  when  the 
analytical  model  meets  the  regularity  conditions  necessary 
for  Theorems  III-7  and  III-8.  In  Chapter  IV,  an  analytical 
model  appropriate  to  the  particle  beam  problem  is  presented, 
the  individual  filter  weighting  factor  expression  is 
evaluated,  and  the  multiple  model  adaptive  estimator 
equations  are  presented. 


IV.  The  Estimator 


IV. 1  Introduction 

In  this  chapter,  the  full  scale  multiple  model  adaptive 
estimator  for  the  particle  beam  problem  is  developed.  The 
first  section  develops  the  a  posteriori  filter  weights. 
The  general  form  of  the  multiple  model  adaptive  estimator  is 
shown  in  Chapter  II,  equation  (24):  an  estimate  is 
generated  as  the  probabilistically  weighted  sum  of  outputs 
of  individual  estimators,  each  based  on  a  specific 
assumption  about  uncertain  parameters.  The  same  form  in 
notation  suitable  for  the  point  process  problem  is  given  in 
equation  (36).  To  derive  the  a  posteriori  individual  filter 

Nt 

weights,  Pr[h^  | 8 t ]  ,  j  =  0,1,2,  ...  Nt~l  ,  the 
representation  theorem  from  Chapter  III  is  used.  Section 
IV. 3  presents  the  equations  for  the  general  multiple  model 
adaptive  estimator  for  the  particle  beam  problem  in  cross 
product  space  terms.  In  this  section,  the  estimator 
development  is  consolidated  in  one  location  for  reference. 
Finally,  Section  IV. 4  presents  examples  of  t!  ~  estimator 
equations  under  several  simplifying  assumptions.  We  begin 
by  deriving  the  weighting  factors  of  equation  (36). 
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IV. 2  The  Multiple  Model  Weighting  Factors 

The  purpose  of  this  section  is  to  evaluate  the 

Nt 

weighting  factors,  Pr[h.  |  S .  ]  ,  for  the  multiple  model 

J  t 

adaptive  estimator  described  in  equation  (36).  In  words,  a 

weighting  factor  is  the  conditional  probability  that  the 

th  Nt  Nt 

j  hypothesis  h.  ,  out  of  a  possible  2  possible 

J 

hypotheses,  is  the  correct  one,  given  the  sub-sigma  field  of 
events  up  to  time  t.  The  representation  theorem,  Theorem 
III-6,  gives  us  the  means  of  calculating  the  probability  of 
an  event  AeA  given  g  .  If  we  let  the  event  a  be 

s  1  i 

the  event  associated  with  the  hypothesis  sequence  h.  v  , 

J 

then  from  equation  (95)  we  have 


fL  t(u'ws)ps(dws) 

Pr[h  Nt|B  ]=  - 

J  /  Lt<w;ws>Ps(dws> 


(147) 


where  Lj.  (<*>;<*>)  is  defined  by  equation  (93). 

Since  we  are  using  a  cross  product  of  three  probability 
spaces  to  model  the  observed  event,  L.(u;co  )  is  actually 
dependent  on  the  events  in  all  three  probability  spaces: 
^t^'^Si  ’ws2  ’ws3^  •  Tde  hazard  functions  are  assumed  to 
be  of  the  form 
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(148) 


+a0(t,r;wS3  )0n(t,r;to;wS2  ) 


Note  that  this  form  is  very  similar  to  that  shown  in 
equation  (102);  however,  for  the  development  of  the 
weighting  factors,  we  do  not  need  to  assume  that  the 
observed  process  is  Poisson  nor  do  we  need  to  assume  the 
Gaussian  shaped  intensity  profile  for 


4>s(t  ,r;w;w  ) 

We  only  require  that  the  process  V  be  regular  (that  is,  it 
satisfies  Theorem  I I 1-8). 

N, 

The  events  A.,  j  =  l,  ...»  2  are,  by  construction, 

J 

mutually  exclusive  and  exhaustive,  thus 


(152) 


=  JJ 


2  JL  t(“;aJs 
j  =  1  A 

j 


:  u» 


S2 


I  to  ) 

S3 


•  P  (doj  )P  (dto  )P  (dto 

S3  S3  S2  S 2  Sj  S ] 


Lt(ui;u)s)Ps(d:V 


In  equation  (153),  the  A.  notation  is  used  to 

J 

integration  over  n  Xfi  XA .  ,  therefore 

si  s2  3 


N 

y  f  L, (tojto  )P  (du 


(153) 


indicate 


as  expected. 


By  equation  (94),  the  denominator  of  equation  (147)  is 


(157) 


N< 


=  n  [4>s(ti,ri;oj)+4>n(ti,ri;a))] 
i=l 


•  exp 


;  w)d^.dT 


to  Y 


where  the  last  step  is  clue  to  the  fact  that  there  is  no 
conditioning  on  therefore 


a0(t,r;w  )  =  aj(t,r;u>  )  =  1 

03  3 


(158) 


We  must  next  evaluate  the  numerator  of  equation  (147) 
to  obtain  the  weighting  factors.  The  numerator  can  be 
expanded  as 


J  Lt(“’“s)Ps(d“s) 

A. 

J 


I 


K 


Ij(“s,)Lt(“;“s,:“s2;“s,) 


8*. 

Si  S  2  S3 


P  (dco  )Pr.  (dw  )P„  ( dw  ) 

S3  S3  Sj  S2  Si  Si 


where  )  is  an  indicator  function  defined  by 

J  3 


j  ®  3 


“s,  c  Aj 


otherwise 


We  can  rewrite  the  integrand  as 


Ij<“sJ)Lt(“;“s>  =  Lt(“;“s> 


=  "  't’s(tk-rk;“:“s1)"  +n(Vr<l;“;“s2) 


Sj 


N. 

J 


o  t 


•  exp 


-f yi(T  ,^;co;ws)dA.dT 


to  Y 


where  the  vertical  bar  is  read  as  "restricted  to  A. 

J 

set  of  signal  indices  is  defined  as 


Nt 

=  {k:h  .  t(k)-l}  =  {k3 ,k: 


kq  } 


(161) 

the 


(162) 


and  the  set  of  noise  indices  is  defined  as 


In  order  to  evaluate  equation  (164),  consider  the  process 

V.:fl  Xfi  Xfl  X[t0  ,t)->[t0  ,t)XRm 

J  Si  S  2  S3 

where  there  is  only  one  possible  hypothesis  sequence  of 
signal/noise  observed  events;  that  is 

P  (A.)  =  1  (165) 

s3  J 

v 

The  sequence  h.  t  is  guaranteed  to  occur.  The  process  vi 
J  J 

is  regular  because  it  is  a  special  case  of  the  process  V. 
which  is  regular.  Since  V.  is  regular,  by  Theorem  I I 1-3  a 
hazard  function  exists  which  is  an  expected  value: 


4>j(t,r;w)  =  <)>j(t,r;oj)=Es{$(t  ,r;co;cos)  |  A0&5t}  (166) 


in  which  the  j  subscripts  indicate  association  with  the 
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By  Theorem  I I 1-5,  equation  (94), 


special  case  process 


(168) 


where  Lt  (w;ug)  is  the  sample  function  density  for  Vj  as 
3 

defined  by  equation  (93): 


L  (w;w  )  = 

tj  S 


[-//< 

L  to  Y 


n  4>j(ti,ri;o);uis)exp  -/  /<frj  (x  ;  w  ;u>s)d4dT  (169) 


The  sure  event  A.  determines  the  value  of  the  coefficients 

J 


I 
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ot0(t,r;u)  ) 

3 


and  a i  (t ,  r  ;<a)  ) 

3 


therefore 


“  "  Wrk;“;"s1)ri  ‘t’n(t«.’:>V“;“s2) 


Sj 


N. 

J 


exp 


4>j  (t  ,A;aj;tos 


)d^dx 


to  Y 


(170) 


We  can  substitute  equation  (170)  into  equation  (1G8)  to 
obtain 


L  (w) 


■/ 


(2 


Jt.(u;“s)Ps(dws) 

J 


s 


(171) 


f  f  "  ♦s<W“:"s,>Ii  > 

O  XT 


s 

j 

Si  s2 


r  t 

•  exp 

*-  t0Y 


a)  :  u)  )  d-tdx 
Si’  S2y 


•  P  (du  )P  (do)  ) 
s2v  s2'  Si  Sl 


(172) 


where  we  have  used  the  property  that  the  integral  term  in 
the  exponential  of  equation  (172)  is  independent  of  w 

55  3 

because  the  a  coefficients  of  4>(*)  only  have  a  value 
of  zero  at  points.  Equation  (172)  is  exactly  the  expression 
we  need  to  evaluate  in  equation  (164).  By  equation  (168), 
we  know  that 


i 


(w ) 


=  n  ♦  <i)n(t£,r£;a))exp 


S3 


N. 


-//id  ,.t; 


;to)d4dx 


L  to  Y 


(173; 


We  can  substitute  equation  (173)  into  equation  (164)  to 
arrive  at  the  numerator: 


f  Lt(“;“s)Ps(d“s) 


=  n  ^s(tk,rk;aj)n  tPn(t£»r£'aj)exp 


sj 


N  . 
J 


-If 


$(t  ,^t;u))didT 


L  to  Y 


(174) 


Substitution  of  equations  (174)  and  (157)  into  equation 


(147)  results  in  the  desired  weighting  factor 


4 


n  $n(t&>rZ;U}) 


I 


N 


.  S  . 

tlBl  =_i 


Pr[h  u|Bt]  = 


Nt 

n 

i=ls 


,(ti,ri;w)+<tln(ti,ri;a))j 


9 


From  this  expression  for  the  weighting  factors,  it 
seen  that 


<  n 


« 


as  discussed  in  the  text  preceding  equation 
Furthermore ,  the  weighting  factors  are  recursive: 


« 


i 
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(175) 


can  be 


(176) 


(155). 


I 


Pr  [h. 


Nt  ]  _  h,j  t<Nt)»s(tNt-rN,;“)  +  )1-hit(Nt)l,»n(tN,-rNt-“> 


( tNt  *  rNt » w )  +  (}ln ( tNt  ’  rNt ;  w  J 


Pr 


N  -i 

V  iBt 

J  r(Nt-l) 


for  t  £t*t  (177) 

XNt  Z  Nt+1 


where  the  j '  notation  denotes  the  "old"  sequence  which  is 

Nt 

concatenated  with  the  new  sequence  value,  h.  (N,  )  ,  to 

Nt  J  t 

obtain  the  "new"  sequence  h.  .  This  is  the  same  notation 

J 

discussed  in  more  detail  in  the  text  following  equation 

(51). 

Nt 

We  now  have  the  weighting  coefficients  Pr[hj  l^] 
for  use  in  the  multiple  model  adaptive  estimator  equation 
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(36).  The  evaluation  of  the  coefficients  is  general  because 
we  did  not  need  to  specify  the  exact  form  of  the  hazard 


function.  It  v/as  only  necessary  that  the  observed  process 
satisfy  Theorem  III-8.  In  the  next  section,  all  the 

r  equations  for  the  multiple  model  adaptive  estimator  for  the 

particle  beam  problem  are  consolidated  and  presented. 


rc  o* 


IV. 3  Estimator  Equations 

In  this  section,  the  results  of  the  previous  section 

and  chapters  are  consolidated  and  the  multiple  model 

adaptive  estimator  equations  for  the  particle  beam  problem 

are  presented.  Only  the  results  are  listed  here; 

references  to  earlier  chapters  and  sections  are  provided  for 

detailed  descriptions  and  derivations. 

IV. 3.1  Assumptions.  A  conditionally  Poisson  space- 

time  point  process  is  observed  on  [t0,t)XRm  .  The  observed 

process  consists  of  point  process  events  from  a 

conditionally  Poisson  signal  process  with  rate  parameter 

X  (t.rjujw  )  and  a  conditionally  Poisson  noise  process 
o  S  j 

with  rate  parameter  X  (t,r;w;w  )  .  The  signal  and  noise 

n  S2 

processes  are  assumed  independent,  therefore  the  rate 
parameter  for  the  observed  process  is 


The 


IV. 3.2  The  Multiple  Model  Adaptive  Estimator, 
multiple  model  adaptive  estimate  of  x(t;ws^)  is  given  by 


Nt 

2  -1 


x(t  ;cd  ) 


-  N 

=  Z  xJ(t)Pr[hj  t|8t] 


(181) 


where  is  the  number  of  observed  events  in  [to,t)XR 

Tlie  development  of  the  MJ1AE  form  is  presented  in  Chapter  II. 

A 

The  individual  hypothesis  estimates,  x^(t),  are  defined  as 


A  A  Nt 

Xj(t)  =  E{x(t;wSi)|8t,hj  > 


(182) 


where  G,  is  the  sub-sigma  field  of  events  up  to  time  t 

Nt  th 

and  h.  is  a  hypothesis  sequence  which  defines  the  j 

J  Nt 

entry  out  of  a  list  of  2  possible  hypothesis  sequences 
at  time  t.  The  hypothesis  sequence  notation  is  defined  in 
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detail  following  equation  (28). 

The  process  x(t)  is  modeled  as  the  output  of  a  linear 
system  driven  by  unit-strength  white  Gaussian  noise.  We 
observe  a  conditionally  Poisson  point  process  which  has  a 
Gaussian  shaped  rate  parameter.  As  a  result  of  this  model, 
we  can  use  the  Snyder-Fishman  filter  to  calculate  the 
individual  estimates. 

The  general  form  of  the  Snyder-Fishman  filter  equations 

A 

used  to  calculate  the  x.  terms  is  given  by  equations  (10- 

J 

13).  Since  the  hypothesis  sequence  h.^*  is  given  (in 

J 

A 

equation  (182)),  the  estimate  for  x.  is  calculated  by 

•J 

considering  only  those  observed  events  which  are  caused 
Nt 

(according  to  h^r )  by  the  underlying  signal  process.  Thus, 

Nt 

for  each  of  the  2  1  models,  we  calculate 


and 


d£,(t)  =  F(t)Z  .(t)dt+|.(t)FT(t)dt+G(t)G(t)T(t)dt 
— J  “J 


(184) 


-(t)i 


.  (t)N(dtXd^) 


C 


where  the  counting  integrals  in  equations  (183)  and  (184) 

only  have  nonzero  value  at  the  measurements  (t,r)  which 

^t 

are  specified  by  hj  to  have  been  caused  by  the  signal 

process . 

Equations  (183)  and  (184)  can  also  be  written  in 
propagate-update  form,  as  is  commonly  done  for  Kalman  filter 
realizations.  Let  t  denote  a  time  immediately  prior  to 
an  observed  point  process  event  and  let  t+  denote  a  time 
just  after  an  observed  point  process  event  and  its 
incorporation  into  the  estimate.  The  estimate  of  x(t)  is 
propagated  by 


dx.(t)  =  F(t)x.(t)dt  {+  c'  if  used} 
J  J 


! 


for: 


< 


i 


1,2 


Updates,  due  to  observed  point  process  events 
accomplished  by 


! 


~  + 

xj<V 


=  xj(tJ)+K,(tJ)(ri-H(t:)xj(t")) 


l  1,2,  •••,  Ni 


(185) 


are 


(186) 


The  propagate  form  for  the  individual  covariances  is 


(189) 


r« 


re  o* 


K.(t)  =  E(t)HT(t)[H(t)m)HT(t)+R(t)]’‘ 
•»  J  J 


Note  that  we  could  use  another  (perhaps  nonlinear) 
model  for  x(t)  and  an  arbitrary  rate  parameter  for  the 
conditionally  Poisson  signal  process.  The  general  form  of 
the  MMA  estimator  (equation  (181))  is  unchanged;  however, 
the  individual  estimates  of  x(t)  would  have  to  be 
rederived  for  the  new  model.  The  Gauss-Markov  nature  of 
x(t)  and  the  Gaussian  shaped  rate  parameter  allow  us  to  use 
the  convenient  Snyder-Fishman  filter. 

The  weighting  factors  in  equation  (181)  are  generated 


as 


N. 


nXs(tk’rk;u,)  *  n  Xn(t£’r£;w) 


PrChj  l|Bt] 


N. 


N 


i=l 


Xs(ti'ri;u)  +  x 


n(ti'ri'w> 


(190) 


where 


*  _  A 

Xs(t,r;w)  =  Eg 


^s(t,r;o);ws 


(191) 


and 


V) 


An(t,r;a)) 


Vt-r;“;“s2 


(192) 


These  weighting  factors  are  developed  in  Section  IV. 2. 

From  equation  (27),  the  covariance  of  the  estimator  is 


~  V  N. 

ut)  =  2  Pr[hj  iBt] 

j=0 


(193) 


£j(t)+[Xj(t)-x(t)][Xj(t)-X(t)] 


T 
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K  O 


where  the  individual  covariances  are  given  by  the  solution 

Nt 

to  equation  (184)  for  j  =  0,1,  ...  ,2  -1 

The  equations  in  this  section  specify  the  multiple 

model  adaptive  estimator  for  the  (very  general)  form  of 

X  (t.r:oj;u)  )  given.  The  following  section  describes 

n  s  2 

several  examples  of  the  estimator’s  equations  for  various 

assumed  forms  of  x  and  x 

s  n 

IV. 4  Examples 

The  following  examples  provide  insight  into  the 
structure  of  the  estimator  and  a  sample  description  of  the 
model  to  match  a  simple  particle  beam  tracking  application. 

IV. 4.1  Example  a:  Structure  Insight.  In  this 

example,  we  see  that  the  weighting  factors  defined  by 
equation  (190)  reduce  to  equation  (67)  when  the  same 
assumptions  are  made. 

For  the  development  of  equation  (67),  it  is  assumed 
that  both  the  signal  and  noise  rate  parameters  are 
noni'andorru 

Xs(t,r,x(t ;wSi ) ;w)  =  Xg(t,r,x(t))  (194) 
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I 


where  x(t)  is  known,  and 


Xn(t,r;a);aJS2  )  =  Xn(t,r) 


(195] 


We  can  substitute  these  values  into  equations  (191)  ar 
(192)  to  obtain 


Xs(t,r;w)  =  Xs(t,r)  (196 


and 


Xn(t,r;w)  =  Xn(t,r)  (197 


We  can  substitute  equations  (196)  and  (197) 


in 


equation  (190)  to  obtain 


(198) 


n  WVx(tk”  n  Xn<W 


N, 


Pr[h  t|8t]  = 


_  3 


Ni 


N. 


.ni|is(t1,r1,x(ti))+xn(ti,ri)  | 


Equation  (198)  is  the  same  as  equation  (67).  Thus,  for  the 
trivial  case  of  known  ^c(t,r,x(t))  and  X  (t,r) 
the  full  scale  estimator  reduces  to  the  same  form  as  in  our 
initial  multiple  model  adaptive  estimator  development. 

IV. 4. 2  Example  b:  Tracking  Model.  In  this  example, 
we  present  parameters  for  the  model  which  are  suitable  for  a 
simple  tracking  application.  The  problem  is  to  estimate  the 
position  of  a  circularly  symmetric  Gaussian  intensity 
profile  on  a  two  dimensional  detector  array  based  on  the 
observation  of  space-time  point  process  events.  There  is  no 
feedback  control  in  this  example. 


146 


Let  the  point  process  be  observed  on  [to,t)XR2  where 


n  =  2 

dimension  of  *(t) 

m  =  2 

dimension  of  observed 

spatial  vectors. 

As(t,r;w;us^ )=*s(t,r;w 

c  ) 

1 

H(t)  =  H  =  I. 

constant 

I  =  identity  matrix 

> 

/-v 

C+ 

II 

> 

constant 

R(t)  =  R  =  y2I_ 

Y 

scalar  constant. 

i 

a 

/•N 

G(t)  =  gl 

g 

scalar  constant 

u(t)  =  two  dimensional 
A„(t,7;u.;<osj).Xn(7)=  j 

Y/einer  process  of  unit 

diffusion 

A  reyCp2 

n 

1 

0  elsewhere  (199) 

where  Y  represents  a 
two  dimensional  photo¬ 
detector  array  as  in 
Figure  7. 

A 

x(0)  -  V  initial  conditions 

Z(0)  =  z0 

The  vector  x(t;wgl)  is  a  two  dimensional  vector  output 
of  the  linear  system 


The  signal  rate  parameter  is 


X  (t,r;io  )  =  Aexp{-4[r-x(t)]  —  l[r-x(t)]} 

s  si  y2  —  ~ 


(201) 


Thus,  the  point  of  maximum  intensity  on  the  detector  array 
is  x(t)  and  our  goal  is  to  estimate  x(t)  given  a  sequence 
of  point  process  observations  of  the  form  (t,r)  which  are 
corrupted  by  point  process  noise. 

The  noise  rate  parameter  is  selected  to  model  noise 

from  dark  current  electron  emissions  in  a  physical 

photodetector.  Because  of  the  finite  nature  of  Y,  X  is 

n 

integrable  on  R2  (as  required  in  Chapter  III)  and  there 
will  be  no  dark  current  induced  noise  events  outside  of  the 
limits  of  Y  . 

Although  this  description  of  Xn(t  ,r)  models  the 
finite  size  of  a  physical  detector,  notice  that  X  (t.Fjw  ) 
is  defined  for  all  reRrn  throughout  the  development  of  the 
estimator.  If  we  simulate  this  tracking  problem  (on  a 
computer  for  example)  we  must  consider  this  fact.  One 
method  for  dealing  with  this  is  to  make  the  size  of  Y  very 
large  with  respect  to  the  "spot  size"  of  the  signal 
controlled  by  R  .  When  this  is  done,  we  can  insure  that 
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there  will  be  an  acceptably  small  probability  of  observing  a 
signal  event  where  there  is  no  detector  array.  Note  that 
this  modeling  concern  does  not  change  the  estimator's 
equation;  it  only  affects  how  accurately  simulation  results 
can  be  applied  to  a  real  world  system  and  how  closely  any 
calculated  error  statistics  match  the  real  world  system 
performance.  A  second  method  is  not  to  allow  any  "signal" 
point  process  observed  events  outside  of  the  subspace  Y. 
This  would  involve  a  modification  of  the  individual 
estimators  to  include  the  edge  effects  from  the  disallowed 
regions . 

The  expression  for  the  estimator  is  given  by  equation 

A 

(181)  and  the  individual  hypothesis  estimates,  x.  ,  are 

J 

given  in  differential  form  for  j=0,l,2,  ...  ,N  -1  by 

L 


dx j ( t ) 


f  _  sl  _ 

-x  (t)dt  +  /  K.(t)[r-x  .]N(dtXd4) 
j  J  J  J 

R2 


(202) 


d£  (t)  =  -2Z  (t)dt+g2Idt- /K  (t)£  (t)N(dtXdl)  (203) 

J  J  J  J  J 


R' 


K  *5 


A  A 


Kj(t)  =  ZJ(t)Uj(t)+y2j[3  -1 


(204) 


The  weighting  factors  are  given  by  equation  (190)  where 


As(t, 


-  ,  (s  8t)05 


(205) 


When  the  first  point  process  event  is  observed  at 

(t  , r  )  ,  we  have 

i  i  ' 
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This  is  expected  since  there  are  only  two  possible 
hypothesis  sequences  at  time  tx  and  one  or  the  other  must 
be  correct. 

At  this  point,  it  is  useful  to  introduce  a  superscript 

A 

for  x  which  is  similar  to  that  used  for  the  hypothesis 

-  Nt 

sequence  notation.  We  let  x.  (t.,*)  be  the  individual 

_  J  N 

estimate  of  x  at  time  ti ,  assuming  that  h^  is  the 

correct  sequence.  The  superscript  specifies  that  this  value 

A 

for  x  is  evaluated  after  incorporation  of  the  observation 

numbered  N.  .  Note  that  j  can  take  on  values  in  the  set 
Nt 

0,1,2,  ...  ,2  -1  and  that  i-N^ .  The  same  superscript 

can  be  assigned  to  the  individual  covariance  values: 
~  Nt 

2-  (t).  This  additional  notation  is  necessary  because  the 

growing  number  of  hypothesis  sequences  ctuses  the  subscript 

indices  to  change  as  each  new  data  point  is  observed.  This 

can  be  seen  in  the  rest  of  this  example. 

With  this  notation,  the  individual  estimates  for  x0 

c.1 

and  xi  (in  differential  form)  are  given  by 


£.1 

dx  o ( t )  = 


-  x  o  ( t  )  dt 


(209) 


to  -  t  -  tT 
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c 


[ 

A  1  1 

dx!(t)  =  -Xidt  (210) 

I  t0  -  t  -  t7 


I 

xj(tt)  =  xJctD+KiCtOfr^xI  (,t7)]  (211) 


1  <9 


~ i 

d£ i ( t )  = 


~  i 


-2£i  (t)dt+gzj[dt 


(212) 


t  o  -  t  -  1 7 


a  i  +  «i 


£i(ti)  =  £i(t1)-K1(t1)E1(t1) 


(213) 
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K  i  ( t )  =  Z1i(t)[s!(t)+Y2l]"1 


(214) 


Note  that  we  did  not  need  to  calculate  d_E0(t)  explicitly 

A 

for  the  estimate  of  x(*)>  although  it  is  necessary  for 
calculation  of  the  overall  covariance  of  the  estimator. 

When  the  second  point  process  event  is  observed  at 
(t2,r2)  ,  the  weighting  factors  are 


,  X  (ti ,n)X  (t2 ,r2) 

Pr [h2  j  8.  ]  =  -5 - 2 - 

0  z  A 


(215) 
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(216) 


Pr[hI|Bt]  = 


Anlti,ri)  As(t2,r2) 
A 


Pr[h2 | Bt ] 


As(ti,r:)  An(t2,r2) 
A 


(217) 


re  o* 


Pr[h3|Bt] 


^s(ti  ,*i  )As(t2  ,r2  ) 
A 


(218) 


where 


< 


A  =  n  ^S(ti»ri)  +  An(ti,ri)] 
i=l 


(219) 
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Note,  again,  that  the  weighting  factors  sum  to  one. 

This  example  displays  how  the  MMAE  algorithm  progresses 
as  each  new  space-time  point  process  event  is  observed.  In 
general,  to  advance  from  time  t*  to  time  ti+1+  tie 
estimator  must: 


P 


(a)  For  j  =  21-!  down  to 


(i) 

Propagate 

to 

x  Nt(t 

2j  ui+l  ; 

(2) 

• 

Propagate 

to 

E  Nt(t  ") 

— 2 j  ui+l  > 

(b)  For 

j  =  2i+1-l 

down 

to  0: 

If 

j  is  even, 

then  (it 

was  a 

noise  event) 

(1) 

Let 

ii 

oX|> 

Nt(t"  ) 
^i+l' 

(2) 

Let 

A. 

~  “j 

Nt(f  ) 

1  i+l; 

(it 

was 

a  signal 

(3) 

Let 

A 

Xj 

Nt(t" 

-1  ^  i+l 

(4) 

—  ^t(t“  ) 

Update  x.  k i+l' 

J 

~  N 

to  x.  t( 

(5) 

Let 

E  Nt(t-  ) 
-j  Ui+l' 

A 

=  -j 

Nt(t" 

-1  ui+l 
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(6)  Update  £  (ti+l)  to  Lj  t(ti+l> 


(Note  the  changing  subscripts  and  the  doubling  of  the  number 
of  active  filters) 

(c)  For  j  =  0  to  2*  -1: 

Calculate  Pr [hj 1+1 | &t] 

A 

(d)  Calculate  x(ti+^) 


Figures  8  and  9  depict  the  storage  requirements  and 
data  movements  of  the  x  vector  for  one  iteration  of  the 
above  algorithm.  Figure  8  shows  the  doubling  of  the  memory 
required  as  the  state  of  each  elemental  filter  is  propagated 
to  the  next  event  time  in  step  (a).  In  Figure  9,  the  update 
phase,  step  (b),  is  shown.  The  elemental  covariance 
matrices  are  propagated  and  updated  in  exactly  the  same 
manner. 

The  complexity  of  the  expressions  and  the  expanding 
number  of  individual  filters  strongly  motivate  the 
simplifications  to  the  estimator  discussed  in  Chapter  V. 


IV.  5  Summary 

In  this  chapter,  we  have  developed  the  weighting 
Nt 

factors,  Pr[h  |B  ]  ,  for  the  multiple  model  adaptive 

J  « 

estimator  presented  in  Chapter  II.  The  development  is 
possible  because  of  the  cross  product  space  modeling 
concepts  and  results  of  Chapter  III,  particularly  the 
representation  theorem.  The  multiple  model  adaptive 
estimator  equations  are  consolidated  in  this  chapter  and 
several  examples  are  given  to  provide  insight  into  the 
estimator's  structure  and  calculations.  Simplifications  to 


the  estimator  are  discussed  in  the  next  chapter. 


L6J 


V.  Filter  Simplifications 

V.l  Introduction 

The  multiple  model  adaptive  estimator  developed  in  the 
previous  chapters  provides  the  minimum  mean  square  error 
estimate  of  a  vector  x  (which  is  not  directly  observable) 
from  observations  of  a  point  process  signal  in  additive, 
independent  point  process  noise.  The  estimator  consists  of 
elemental  filters,  the  number  of  which  grows  exponentially 
with  each  new  observed  space-tine  event.  In  addition,  the 

A  A 

calculation  of  the  and  \n  terms  of  the  weighting 
probabilities  (equation  (175))  is,  in  general,  complex. 
These  two  factors  motivate  the  simplifications  developed 
here. 

In  this  chapter,  we  develop  simplifications  to  the 
multiple  model  adaptive  estimator  which  require  less 
computation  than  the  methods  of  the  full  scale  estimator. 
The  simplified  filter  does  not  have  a  growing  requirement 
for  either  memory  or  computation  as  additional  space-time 
events  are  observed.  The  simplifications  result  in  a 
suboptimal  filter;  however,  they  are  based  on  an 
understanding  of  the  structure  of  the  full  scale  estimator 
and  a  brief  discussion  is  given  of  the  conditions  under 
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<  *>• 


which  these  simplifications  are  appropriate. 

Two  areas  are  considered:  calculation  of  the 

probability  weighting  factors,  and  limitation  of  the 
exponential  growth  of  the  number  of  elemental  filters.  We 
begin  by  considering  calculation  of  the  weighting  factors. 

V.2  Weighting  Factor  Simplification 

The  form  of  the  general  weighting  factors  as,  shown  in 
equation  (175),  is  very  straightforward.  The  key  to  the 
complexity  (or  simplicity)  of  calculation  is  in  the 

A,  /V 

evaluation  of  the  terms  <b  and  <J>  , 

s  n 

A 

The  term  d>  is  defined  as 
s 


<!>s(t,r;w)  =  Es{4>s(t,r;u);ws)  iBt> 


(220) 


I 


and  <J>n  is  defined  in  a  similar  manner.  The  difficulty  in 
calculating  these  two  terms  is  highly  dependent  on  the  model 
of  the  underlying  physical  process.  At  one  extreme,  if  we 
make  the  assumption  that  observed  process  is  Poisson  with 
known  signal  rate  parameter 


4>s(t,r;oj;u)s)  =  Xg(t,r,x(t)) 


(221) 


and  if  x(t)  is  known,  then 
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(222) 


<|>(t,r;co)  =*  Ag(t,r,x(t)) 

because  <t>  is  not  dependent  on  oj  .  At  the  other 
s  s 

extreme,  we  could,  conceivably,  formulate  a  model  for  which 

A  A 

there  is  no  closed  form  solution  for  <p  and  6 

s  n 

A  more  useful  case  is  that  of  the  particle  beam  problem 

where  A  is  defined  by  equation  (3)  and  A  is  a 

®  n 

deterministic  function  of  t  and  r.  (A  random  noise  rate 
parameter  is  allowed  and  useful  for  the  beam  problem.  We 
are  given  the  form  of  A  and  there  are  significant 

S 

A 

difficulties  in  computing  A  .  Because  we  have  considerable 

o 

freedom  in  the  form  of  Aq  ,  only  simplifications  to  the 

A 

Ag  calculation  are  considered  here.) 

Additionally,  let  there  be  no  feedback  for  this 

example.  Since  A„  is  not  random 

n 

An(t,F)  =  An(t,r)  (223) 

and  we  must  evaluate 


yt'r> 


*s(t,r,e)f(5 


(224) 


where  £  is  the  dummy  variable  for  x(t)  and  f(£|zNt) 
is  the  probability  density  function  of  x(t)  given  the 


measurement  history.  We  can  use  Bayes'  rule  to  obtain 


_  N 

f (C| z  Z)  = 


N 


f  (Z 


t\Z)f(X) 


f (ZNt) 


(225) 


Nt 

in  which  f(Z  )  is  the  sample  function  density  of  the 
observed  process  and  f(X)  is  the  probability  density 
function  of  x(t)  propagated  from  time  t0 

From  the  definition  of  a  sample  function  density,  we 
can  write 


N. 

f(z  tUg) 

f(Z  r) 


=  exp 


-/  yi(T,> t, 


Cg)-«KT  ,4)d4dx 


to  Y 


+ 


t 


I  fla[ 


— - —  -  p— 

<P(t,a) 


N(dx 


X  d4) 


(226) 


where  <j>  =  X  (t,r,£)  +  X(t,r)  and  5  is  the  given  value 
s  n  g 

of  X(t)  in  the  numerator  probability  density  function  in 
equation  (225). 

We  now  make  the  assumption  that  Y=Rm  .  To  satisfy  the 


regularity  conditions,  An(t,r)  must  now  be  integrable 
over  Rm.  This  assumption,  that  Y=Rrn  ,  is  not  too 
restrictive,  however,  because  we  can  usually  redefine 
A  (t,r)  to  be  integrable  over  some  region  of  interest  and 
zero  elsewhere. 

As  a  result  of  this  assumption,  we  can  simplify 
equation  (226)  further.  Consider  the  term 


U  l 

/  J  (4>-4>)dIdx  =  /  y*4>(T,I, 


£  )-<Kt  ,*)d*dT 

D 


K 


m 


to  R 


m 


(227) 


t 

=  J  J  ^(t,a,T  )-Eg{^(T,T,T)|Z  tjdAdr 

,  Rm 


We  can  integrate  the  terms  separately  and  interchange  the 
order  of  integration  and  expectation  by  the  Fubini  theorem 


to  obtain 


//.- 


cjjd'idT 


to  R 


m 


t 

=  f  ,*,Tg)dK-Es  I  A(T,I,I)dIiZNt  j 

t  o  -Rm  Rm 


dr 


(228) 

Because  the  assumed  Gaussian  shape  of  A  (t,r,lf)  results 

s 

in  a  tractable  integration,  we  can  evaluate  the  integral 


/ 


Ht,A,Z)dA 


R 


m 


«i(t;u  ) 

®  3 


/ 


As(t,A,Od*+«o(t  ;w  ) 


R 


m 


'fa 

R 


(  t  ,  A  )  dA 


(229) 


m 

2, 


=  A(t)(27r)  |R(t)|^+ 


f 

Rm 


A  (t  ,^)d^i 


(230) 


where  the  last  step  follows  from  equation  (120)  and  the  fact 
that  no  conditioning  on  a  particular  hypothesis  sequence  is 
given,  therefore 


<*o(t;w  )  =  a  i  (t  ;o>  )  =  1 

S3  S3 


(231) 


2  h 

Note  that  A(t)(2-x)  |R(t)|5  is  not  a  function  of  w  or 
03g,  even  if  feedback  control  is  included  in  the  manner 
previously  described.  For  the  assumed  X 


/vt.*)d*  =  Es{ /  An(t,^.)d^jZNt} 


(232) 


therefore 


//- 


<j>  d*  dx  =  0 


t0  R 


(233) 


and  equation  (226)  reduces  to 


=  exp 


t 

/ 


f  f  In  T  N ( 


dx  X  d-i ' 


c 


From  the  definition  of  a  counting  integral,  we  can  write 
equation  (234)  as 


f  (Z 
f(Z 


im 

Nt) 


n 

i=l 


4>(ti,ri) 


(235) 


►  -i 

X  *>< 


where  the  index,  i  ,  corresponds  to  the  observed  point 
process  events  {z(l),  z(2),  ...  ,z(i),  ...  z(Nt)>  . 

By  substituting  equation  (235)  into  equation  (225)  and 
equation  (224),  we  have 


Nt  ^ 

Xs(t,r)  n  i(t.,r.) 
i=l 


< 


Ag(t,r,C) 


Nt 

n 

i=l 


*(ti,ri,5)f(Od* 


(236) 
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Equation  (236)  is,  in  principle,  solvable,  however  it 
requires  an  increasing  amount  of  calculation  as  more  point 
process  events  are  observed.  The  order  of  the  polynomial  in 

A 

X  (which  must  be  solved  at  each  event  time)  is  jj  +1  and 
s  t 

the  product  of  <J>  terms  in  the  integral  increases  by  one  at 
each  observation  time.  In  addition,  f(£)  is  the  Gaussian 
probability  density  function  for  x(t).  The  covariance  of 
this  density  increases  substantially  as  the  density  is 
propagated  forward  in  time.  This  will  eventually  lead  to 
numerical  problems  in  an  implementation  of  the  estimator. 

/v 

Due  to  these  difficulties  in  evaluating  1  for  this 
example,  consider  the  following  simplification: 


M  A 
To  evaluate  ^s^i,ri^  take  the  estimate,  x(ti_1) 

of  x(ti_^)  and  propagate  it  forward  in  time  to  ti. 

A 

Use  this  propagated  estimate,  x(ti)  to  evaluate 

A 

equation  (3)  to  obtain  a  value  for  Ag 


Two  points  should  be  noted.  Firs. ,  since  equation  (3) 
is  not  linear  in  x(t)  in  general  E{Ag'x(t))}  will  not 
be  exactly  equal  to  Ag(E{x(t)})  ,  and  it  is  the  latter  form 
we  are  using  in  this  simplification.  Second,  since  we  are 
propagating  an  old  estimate  of  to  use  at  time 
t^  ,  it  is  important  that  we  use  an  accurate  propagation 
model.  If  the  assumed  model  of  the  underlying  process 
(equation  (4))  has  characteristics  different  than  those  of 
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the  actual  physical  process,  then  we  would  expect  this 
simplification  to  result  in  poor  performance  due  to  the 
inaccurate  propagation.  We  have  assumed  perfect  knowledge 
of  the  dynamics  of  the  underlying  physical  process 
throughout  this  research.  Thus,  this  simplification  is 
reasonable . 

V.3  Limitation  to  a  Fixed  Number  of  Elemental  Filters 

In  this  section,  we  consider  a  method  of  limiting  the 
exponential  growth  of  the  full  scale  multiple  model  adaptive 
estimator  so  that  (after  the  startup  of  the  filter)  there  is 
always  a  fixed  number  of  elemental  filters.  The  overall 
method  is  to  consider  only  observations  from  a  finite  data 
window. 

Let  D  be  the  "depth"  of  the  algorithm  where  the  depth 
is  the  number  of  the  most  recent  point  process  observed 
events  which  are  explicitly  included  in  the  calculation  of 
the  elemental  filter  estimates.  If  D  events  are 
considered,  then  there  are  2°  elemental  filters  and  the 
number  of  filters  is  constant.  As  each  new  point  process 
event  is  observed,  we  must  eliminate  the  "oldest"  observed 
event  and  incorporate  the  new  event.  Three  methods  for 
accomplishing  this  are  discussed  in  the  following  sections. 


V.3,1  Method  One:  Re-initialization.  The  first 

method,  called  re-initialization,  restarts  the  entire 
multiple  model  adaptive  estimator  each  time  a  new  point 

process  event  is  observed,  which  would  result  in  more  than 

„D 

2  elemental  filters.  The  now  event  is  added  to  the  set  of 
observations  considered  in  the  algorithm  and  the  oldest 
observed  event  is  discarded  so  that  there  is  a  maximum  of  D 
observed  events  explicitly  present  in  the  MMAE  calculation. 

The  algorithm  for  the  re-initialization  method  is  as 
follows.  Recall  that  is  the  number  of  point  process 

events  in  the  interval  [t  ,t)  . 

(a)  For  N  -  D  : 

zi 

Operate  as  a  full  scale  MMAE  filter  as  described 
in  Section  IV. 3. 

(b)  For  N  >  D  : 

(1)  Re-initialize  the  filter  by  letting 

A 

sit  be  the  new  initial  condition  at 

time  ti_D  # 

(2)  Propagate  and  update  the  elemental  filters 
from  time  tj,_Q  to  time  t^ 

A 

(3)  Calculate  x(t^)  as  a  weighted  sum  as 
described  in  Section  IV. 3. 


Even  though  only  the  most  recent  D  observations  are 


I 


I 


P 

m 


I 


I 


explicitly  included  in  the  calculation,  the  older  observed 

A 

events  influence  the  value  of  x(t^_^}  .  Thus  we  fix  the 
number  of  elemental  filters  by  a  data  window  concept  but  the 
entire  measurement  history  is  still  considered  in  the 

A 

calculation  of  x(t^) 

A  major  disadvantage  of  this  method  is  that  the  entire 
set  of  2°  elemental  filters  must  be  propagated  over  D 
inter-event  intervals  and  the  updates  due  to  D  events  must 
be  incorporated  for  each  newly  observed  point  process  event. 

V.3.2  Method  Two:  Strict  Window.  The  second  method, 
termed  "strict  window",  completely  disregards  all  observed 
events  except  the  most  recent  D  events.  This  is 

accomplished  by  assuming  that  all  events  prior  to  the  most 
recent  D  events  were  caused  by  noise.  The  algorithm  for 
the  strict  window  method  is  as  follows: 

(a)  For  -  D  : 

1 

Operate  as  a  full  scale  MMAE  filter  as  described 

in  Section  IV. 3. 

(b)  For  N.  >  D  : 

1 

(1)  Retain  only  the  (2D)/2  elemental  filters 
from  the  lower  half  of  the  hypothesis 
sequence  tree  at  time  ti_1 


3 


**0  o* 

a 


Figure  10.  Strict  Window  Method 


(2)  Propagate  the  (2D)/2  filters  to  time  t.^ 
and  update  them  to  obtain  2IJ  elemental 
filters . 

A 

(3)  Calculate  x(t^)  as  in  Section  IV.  3. 

The  strict  window  method  is  depicted  in  Figure  10  for 
the  sample  case  of  D=2.  The  bottom  hypothesis  sequence, 
ho,  is  that  which  assumes  all  observed  events  are  due  to 
noise.  Because  of  this,  we  do  not  need  to  re-initialize , 
repropagate,  or  re-update  all  2D  elemental  filters  as  each 
new  event  is  observed;  the  calculations  have  already  been 
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performed. 

A  major  disadvantage  of  the  strict  window  method  is 
that  the  "old"  observed  point  process  events  which  are 
outside  the  strict  window  are  completely  ignored.  They  have 

A 

no  effect  on  the  value  of  x(ti)  and  thus  we  are  not 

gaining  information  about  x(t^)  from  the  measurement 

history  from  time  t.  to  time  t .  .  A.  second 

0  1— u 

disadvantage  of  the  dtrict  Window  method  is  that  it  relies 
completely  on  the  filter's  internal  model  of  the  dynamics  of 
the  underlying  process  for  the  time  interval  prior  to  the 
data  window.  This  could  cause  serious  errors  if  the 
filter's  model  were  not  correct. 

V.3.3  Method  Three:  Best  Half.  The  third  method, 
termed  best  half,  fixes  the  number  of  elemental  filters, 
uses  information  about  x(t^)  from  the  entire  measurement 
history  as  in  method  one,  and  retains  the  efficiency  of 
calculation  as  in  method  two.  The  best  half  algorithm  is  as 
follows : 

(a)  For  N,  -  D 
zi 

Operate  as  a  full  scale  MMAE  filter  as  described 
in  Section  IV. 3. 
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(b)  For  N,  >  D  : 
zi 

(1)  Calculate: 


2D-1 


__  JM 

Pr [upper  half]  =  ^  Pr[h^  H^] 


Pr[lovver  half]  =  Pr[h.  i|Bt3 


(2)  If:  Pr [upper  half]  >  [Pr  lower  half] 


Propagate  the  upper  half  of  the  hypothesis 
sequences  to  time  t^  and  update  them  with 
the  observation  at  time  ti 


Otherwise : 


Propagate  and  update  the  lower  half. 


The  best  half  algorithm  is  shown  in  Figure  11  for  the 
case  of  D=2.  The  solid  lines  depict  the  propagation  and 
update  paths  when  the  upper  half  is  the  most  probable  set  of 
hypotheses.  The  dashed  lines  depict  the  propagation  and 
update  paths  when  the  lower  half  is  the  most  probable  set  of 


176 


Figure  11.  Best  Half  Method 


hypotheses.  Note  that  if  the  lower  half  is  chosen  as  most 
probable,  then  the  propagation  and  update  steps  for  that 
observation  are  identical  to  those  of  the  strict  window 
method.  In  essence,  the  best  half  algorithm  makes  a  final 
decision  at  time  ti  as  to  whether  the  event  at  time  ti_c 
was  caused  by  signal  or  noise.  Because  the  algorithm  can 
select  from  either  half  of  the  set  of  hypotheses,  the  entire 

A 

measurement  history  influences  x(ti).  in  addition,  since 
the  algorithm  selectively  retains  half  of  the  possible 
hypotheses,  it  is  not  necessary  to  re-init ialize , 
repropagate,  or  re-update  all  2^  elemental  filter  over  D 


inter-event  intervals  each  time  a  new  event  is  observed. 

Due  to  its  efficiency  of  calculation  and  consideration 
of  the  entire  measurement  history,  the  best  half  method  is 
recommended  as  the  best  means  (for  the  three  methods 
considered)  of  limiting  the  estimator  to  a  fixed  number 
of  elemental  filters. 

V.4  Summary 

Two  major  areas  of  simplification  for  the  MMAE 
algorithm  are  addressed  in  this  chapter.  First,  the 
complexity  of  evaluating  the  weighting  factors  is  reduced, 
in  Section  V.2,  by  approximating  A  with  an  estimate  of 

b 

Ag  based  on  the  previous  observed  point  process  events. 
This  simplification  is  reasonable  due  to  the  assumed  Poisson 
statistics  of  the  signal  process  and  the  assumed  perfect 
knowledge  of  the  dynamics  of  the  underlying  process. 
Second,  the  exponential  growth  in  memory  and  calculation 
required  is  avoided  by  using  data  window  concepts  to  keep 
the  number  of  elemental  filters  fixed.  Three  possible 
methods  of  achieving  a  fixed  number  of  filter  are  discussed 
and  the  "best  half"  algorithm  (Section  V.3.3)  is  recommended 
due  to  its  relative  efficiency  of  calculation  and 
incorporation  of  the  entire  measurement  history. 

Results  of  Monte  Carlo  simulations  of  a  multiple  model 
adaptive  estimator  using  these  simplifications  are  presented 
in  Chapter  VI . 


VI.  Simulation  Results 


VI.  Introduction 

The  results  of  a  Monte  Carlo  simulation  of  the 
simplified  multiple  model  adaptive  estimator  are  presented 
in  this  chapter.  All  of  the  results  are  from  an  estimator 
implemented  using  the  approximation  for  calculation  of 
(Section  V.2)  and  the  Best  Half  method  for  limiting  the 
estimator  to  a  fixed  number  of  elemental  filters  (Section 
V.3.3).  These  simplifications  were  made  to  ease  the  large 
computation  and  storage  requirements  of  the  full-scale 
estimator,  as  discussed  in  Chapter  V. 

The  goal  of  these  simulations  is  to  determine  the 
trends  in  the  error  of  the  estimator  as  several  of  the  major 
parameters  are  varied  and  also  to  investigate  the 
sensitivity  to  these  parameters.  Although  the  actual  values 
used  for  the  parameters  are  appropriate  for  a  tracking  type 
application,  they  are  not  taken  from  an  actual  tracking 
problem.  Therefore,  the  error  performance  results  are 
useful  as  a  relative  measure  of  performance  as  various 
parameters  are  changed  rather  than  as  an  absolute  measure  of 
the  estimator's  performance  for  a  particular  application. 

In  Section  VI. 2,  the  specific  model  used  in  this 


simulation 


is  described.  Section  VI. 3  contains  some 
preliminary  results  including  sample  track  plots  and  the 
sensitivity  of  the  estimator  to  the  number  of  runs  in  the 
Monte  Carlo  simulation.  In  Section  VI. 4.1  through  VI. 4. 6, 
the  sensitivity  of  the  estimator  to  six  major  parameters  is 
shown.  Section  VI. 4. 7  contains  results  of  several 
simulations  to  test  the  ability  of  the  estimator  to  acquire 
the  true  value  given  poor  initial  conditions.  An  initial 
lower  and  upper  bound  on  the  performance  of  the  estimator 
are  given  displayed  in  Section  VI. 4. 8.  Some  other 
simulation  considerations  are  discussed  in  Section  VI. 5 
including  the  effects  of  mismatched  models  and  two  suggested 
methods  of  dealing  with  the  mismatch. 

VI. 2  The  Simulation  Model 

The  model  described  here  is  of  a  one  dimensional 

tracking  application  in  which  there  is  no  feedback  control. 

Let  A  be  defined  as  in  equation  (3)  where 
s 


m  =  1 
A(t)  =  1 
H(t)  =  1 

R(t )  =  R  (scalar) 


(237) 


Let  the  process  x(t)  be  defined  in  differential  form  by 


equation  (4)  where 


x(t)  =  x(t)eRn 
n  =  1 

G(t )  =  g  (scalar) 
F(t)  =  -  | 


(238) 


and  where  u(t)  is  a  one  dimensional  Wiener  Process  of  unit 
diffusion. 

Let  the  photo- electron  event  detector  be  modeled  as  a 
10  centimeter  (cm)  interval  on  the  real  line  centered  about 
zero.  Thus  r  and  x  have  dimensions  of  centimeters  and 
the  dimension  of  R  is  centimeters  squared. 

We  define  the  noise  rate  parameter  as 


Xn(t’r;"s)  " 


X  -5cm  <  r  <  5cm 
n  —  — 


(239) 


elsewhere 


This  form  is  selected  to  model  point  process  noise  events 
caused  by  a  uniform  dark  current  mechanism  in  a  continuous 
detector  over  the  interval  -5cnKr£5cm.  A  simple  random  Xn 
case  is  considered  in  Section  VI. 4.1. 


All  of  the  simulations  are  made  over  a  100  second  time 


I 


interval.  For  this  one  dimensional  case,  the  expected 
number  of  signal  events  is  proportional  to  the  area  under 
the  Gaussian  shaped  function  and  the  expected  number  of 

noise  events  is  proportional  to  the  area  under  the 
function  defined  by  equation  (239).  In  the  simulation  data 
that  follows,  the  signal  to  noise  ration  is  defined  as 


SNR 


A/2TR 

XnL" 


(240) 


where  L  is  the  length  of  the  detector.  The  numerator  of 
equation  (240)  is  the  expected  number  of  signal  point 
process  events  per  unit  time  interval  and  the  denominator  is 
the  expected  number  of  noise  point  process  events  per  unit 
time  interval.  Thus,  the  SNR  for  this  point  process  model 
is  the  ratio  of  the  expected  number  of  signal  events  to  the 
expected  number  of  noise  events. 

VI. 3  General  Results 

Figure  12  shows  the  true  value  of  x  and  the  output  of 
the  multiple  model  adaptive  estimator  for  one  100  second 
simulation  of  the  filter.  The  true  value  of  x  is 
displayed  by  the  solid  line  and  the  broken  line  is  the 
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Figure  12.  Example  of  True  and  Estimated  x(t)  Values  for  One  Run. 


J 


i 


filter's  estimate  of  x  .  The  values  of  the  various 
parameters  are  listed  on  the  figure.  In  all  of  these 
figures,  the  expected  number  of  signal  point  process  events 
is  100  unless  noted  otherwise  (as  in  Section  VI. 4. 2).  This 
sample  run  was  made  by  passing  to  the  filter  the  true 

A 

initial  conditions,  x  =5cm.  The  smoothness  of  x  with 

7  0 

A 

respect  to  x  is  inherent  because  x  is  a  weighted  sum  of 
up  to  eight  elemental  filter  estimates,  and  because  the 
filter  is  given  the  exact  dynamical  model  for  x  . 

Figure  13  shows  the  ensemble  averages  over  50 
simulation  runs  of  x  and  x  for  the  same  set  of 

parameters  and  initial  conditions  displayed  in  Figure  12. 
The  ensemble  averages  of  the  error  statistics,  for  this  50 
run  example,  are  shown  in  Figure  14.  The  solid  line  in 
Figure  14  is  the  ensemble  average  of  the  error,  where  the 

/v 

error  is  defined  as  x-x  The  two  irregular  broken  lines 
are  the  ensemble  averages  of  the  error  plus  or  minus  the 
standard  deviation  of  the  error.  The  two  relatively  smooth 
dashed  lines  are  plus  and  minus  the  square  root  of  the 

filter  variance.  The  filter  variance  is  the  filter's 

estimate  of  how  well  it  is  performing.  Of  importance  here 
is  the  fact  that  the  filter's  estimate  of  its  error  is 
similar  to  its  actual  performance  and  the  filter  variance 
neither  diverges  nor  goes  to  zero. 
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Figure  13.  True  and  Estimated  Ensemble  Averages  for  50Runs 
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The  smoothing  of  the  data,  as  more  simulation  runs  are 
made,  is  typical  of  Monte  Carlo  simulations.  A  pertinent 
question  is,  how  many  runs  are  sufficient  to  give  accurate 
performance  data  without  expending  an  excessive  amount  of 
computation  time.  One  method  of  addressing  this  question  is 
to  vary  the  number  of  simulation  runs  for  a  fixed  set  of 
input  parameters  and  observe  the  error  statistics.  As  more 
runs  are  made,  the  error  statistics  should  converge  to  a 
final  value.  The  results  of  this  analysis  are  shown  in 
Figure  15.  In  this  figure,  the  number  of  runs  is  varied 
from  two  to  100  and  the  root  mean  squared  (RMS)  error  at 
time  t  =  50  seconds  is  plotted.  The  RMS  error  is  chosen  as 
the  measure  of  performance  in  this  (and  subsequent 
sensitivity  tests)  because  it  gives  a  measure  of  the  error 
from  the  true  value  regard'. ess  of  sign.  If  we  used  the 
absolute  error,  a  positive  error  on  one  run  could  cancel  a 
negative  error  on  another  run,  resulting  in  an  incorrectly 
low  ensemble  average.  The  time  for  sampling  the  error  (t-50 
seconds)  is  chosen  to  minimize  the  effect  of  the  filter 
startup  on  the  performance  results.  In  Figure  15,  we  are 
not  interested  in  the  actual  value  of  the  RMS  error. 
Instead,  we  are  looking  for  a  point  beyond  v/hich  there  is 
little  change  in  the  error.  The  value  of  the  error  is 
erratic  for  numbers  of  runs  less  than  20.  For  20  or  more 
runs,  there  is  little  change  in  the  final  RMS  error.  Based 
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on  this,  the  plotted  results  in  all  subsequent  sections  are 
for  Monte  Carlo  simulations  with  ensemble  averaging  over  50 
runs. 

VI. 4  Parameter  Sensitivity  Results 

In  the  following  sections,  the  sensitivity  of  the 
estimator  to  changes  in  several  of  its  major  parameters  is 
investigated.  The  general  method  is  to  vary  one  parameter 
while  keeping  the  rest  constant.  The  performance  measure 
for  evaluation  of  the  sensitivity  is  the  RMS  error  at  time 
t=50  seconds.  (The  acquisition  results  are  measured  at 
t=16  seconds  as  discussed  in  Section  VI. 4. 7.)  In  all  cases, 
the  number  of  Monte  Carlo  runs  is  50.  We  begin  by  looking 
at  the  effect  of  the  noise  strength. 

VI. 4.1  Sensitivity  to  g.  The  parameter  g  is  the 
square  root  of  the  strength  of  the  white  Gaussian  noise 
driving  the  dynamics  of  the  unobserved  process  x  .  The 
lower  curve  in  Figure  16  shows  the  performance  of  the 
estimator  as  g  is  varied  from  0.01  to  1.0  cm.  The  trend 
is  as  expected;  as  the  dynamics  of  x  increase,  the 
ability  of  the  estimator  to  track  the  changes  diminishes  and 
the  RMS  error  increases. 

The  upper  curve  in  Figure  16  shows  the  effect  on  the 
error  of  a  random  X  .  The  value  of  Xn  in  equation  (239) 
is  calculated  from  the  SNR  and  the  expected  number  of  signal 
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igure  16.  Estimator  Performance  Versus  Dynamic  Model  Noise. 
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point  process  events.  For  the  random  An  case,  we  let  A 
be  a  uniform  random  variable  on  [0,2An]  and  a  single 

realization  of  A  "  is  used  for  generation  of  the  noise 

point  process  events  for  a  single  run  of  the  random  Aq 
simulation.  Note  that  the  realization  is  used  for  the 
entire  run  and  a  new  value  is  selected  for  the  next  run. 

A 

The  estimator  uses  the  value  An  for  calculation  of  An- 
This  procedure  is  chosen  as  a  worst  case  condition;  very 
little  knowledge  is  assumed  by  the  filter  about  the  A^ 
process  and  no  actual  estimation  of  An  is  performed. 

As  can  be  seen  from  Figure  16,  the  RMS  error  increased 
for  the  random  noise  rate  parameter  case,  but  the  increase 
is  very  small  compared  to  the  actual  value  of  the  error. 
For  this  set  of  parameters,  the  estimator  is  relatively 
insensitive  to  uncertainties  in  Ar  . 

VI. 4. 2  Sensitivity  to  Expected  Number  of  Signal 
Events .  In  Figure  17,  the  RMS  error  is  plotted  versus  the 
expected  number  of  point  process  signal  events.  As  might  be 
expected,  as  more  information  is  available  to  the  estimator 
from  the  signal  process,  the  RMS  error  goes  down.  The  trend 
is  actually  rather  mild  from  signal  counts  ranging  from  one 
to  500.  If  the  estimator  only  receives  one  signal-caused 
event  in  100  seconds,  then  it  must  rely  heavily  on  its 
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Figure  17.  Estimator  Performance  versus  Expected  Number  of  Signal  Events, 


internal  model  for  the  dynamics  of  the  x  process.  As 
described  before,  we  have  assumed  perfect  knowledge  of  the 
model.  It  is  expected  that  the  RMS  error  would  be  much 
larger  at  the  low  count  rates  if  the  dynamical  models  of  the 
true  process  and  the  filter  were  mismatched. 

VI. 4. 3  Sensitivity  to  t.  The  sensitivity  of  the  RMS 
error  to  x  is  depicted  in  Figure  18.  As  can  be  seen, 
there  is  a  strong  trend  toward  increased  RMS  error  as  t 
becomes  larger.  When  t  is  large,  the  dynamics  of  x 
depend  proportionately  more  on  the  driving  noise  source  and 
there  is  less  restoring  action  due  to  the  F(t)x(t)dt  term 

A 

in  equation  (4).  This  allows  errors  between  x  and  x  to 
persist  for  longer  periods  between  signal  induced  point 
process  observation.  When  x  is  small,  any  errors  caused 
by  the  driving  noise  in  equation  (4)  are  rapidly  reduced  as 
the  output  decays  to  the  steady  state  value. 

VI. 4. 4  Sensitivity  to  SNR.  Several  SNR  sensitivity 
tests  were  made  for  parameter  sets  similar  to  those  shown  in 
Figures  16  through  18.  In  all  cases,  there  was  virtually  no 
effect  on  the  RMS  error,  even  for  SNR  values  as  low  as  0.01. 

The  SNR  sensitivity  results  shown  in  Figure  19  were 
obtained  by  setting  x  and  R  to  relatively  large  values 
(thus  tending  to  raise  the  overall  RMS  error,  see  Figures  18 
and  21)  and  by  giving  the  multiple  model  adaptive  estimator 
poor  initial  conditions.  The  trend  displayed  in  Figure  19 


Figure  IS.  Estimator  Pe 


Figure  19.  Estimator  Performance  versus  Signal  to  Noise  Count  Ratio 


is  as  expected;  the  RMS  error  is  lower  for  larger  values  of 
SNR.  Of  particular  interest  is  the  fact  that  the  worst  RMS 
error  is  only  0.75  cm  for  an  expected  signal  to  noise  count 
ratio  of  0.1  .  That  is,  one  signal  event  is  expected  for 
every  10  noise  events.  It  is  expected  that  the  RMS  error 
would  increase  more  rapidly  at  low  SNR  values  if  the  true 
and  filter  models  were  mismatched. 

VI. 4. 5  Sensitivity  to  D.  The  sensitivity  to  depth, 
D  ,  is  shown  in  Figure  20.  Note  that  the  error  axis  scale 
on  this  plot  is  expanded  and  the  variation  in  RMS  error  over 
a  depth  range  of  one  to  eight  is  very  small.  This  suggests 
that  if  the  model  of  the  underlying  process  is  well  known 
then  it  may  be  possible  to  consider  only  the  most  recent 
observed  event  and  obtain  acceptable  performance. 

VI. 4. 6  Sensitivity  to  R.  The  sensitivity  to  R  (the 
dispersion  of  the  Gaussian  shaped  X  function)  is 
displayed  in  Figure  21.  The  performance  indicates  that 
there  are  specific  tuning  considerations  for  R  for  a  given 
set  of  parameter  values.  As  R  becomes  very  small,  it 
appears  that  valid  signal  events  are  deweighted  too  heavily 
due  to  the  narrow  shape  of  Xg.  At  the  other  extreme,  when 
R  is  large,  noise  induced  events  near  the  signal  source  are 
not  deweighted  strongly  enough.  This  characteristic  curve, 
as  R  is  varied,  can  also  be  seen  in  Figure  24. 


Figure  21.  Estimat 


VI. 4. 7  Acquisition.  A  final  set  of  parameters  was 
selected  to  test  the  ability  of  the  estimator  to  acquire  the 
true  x  value  given  inaccurate  initial  conditions.  In  all 

A 

of  the  following  cases,  xfl=5  and  xq=0  • 

A 

Figure  22  shows  the  ensemble  average  of  x  and  x  for 

A 

the  relatively  small  values  of  R  =0.2  and  £0  =  * 

The  performance  is  very  poor  until  the  true  x  value  decays 

A 

to  a  region  close  to  x.  The  acquisition  can  be  greatly 

A 

improved  by  setting  R=2  and  E0=10.  These  results  are 
shown  in  Figure  23.  Under  the  new  conditions,  the  estimator 
quickly  acquires  the  true  x  value  and  tracks  it. 

The  RMS  error  performance  versus  R  for  these  two 

A 

values  of  £o  is  displayed  in  Figure  24.  Note  that  the 
values  plotted  are  for  t=16  seconds  to  insure  that  error  in 
the  acquisition  region  is  being  measured.  Both  curves  show 
the  characteristic  tuning  sensitivity  to  R  as  in  Figure 
21.  The  upper  curve  is  the  performance  when  the  multiple 
model  adaptive  estimator  has  a  high  confidence  in  its 
initial  conditions.  The  lower  curve  is  the  performance  for 
the  low  confidence  case. 

VI. 4. 8  Performance  Bounds.  The  estimator's  RMS  error 
versus  square  root  of  noise  strength  is  plotted  in  Figure  25 
along  with  an  upper  and  lower  bound  on  the  RMS  error. 

The  lower  bound  on  peiformance  was  obtained  by 
operating  the  estimator  with  only  signal  observed  events, 
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Figure  23.  Successful  Acquisition 


figure  24.  Estimator  Acquisition  Performance  versus 


i.eure  25.  Performance  Bounds 


corresponding  to  the  case  in  which  the  adaptive  estimator 
perfectly  deciphers  which  events  are  due  to  signal  and  which 
are  due  to  noise.  This  was  accomplished  by  setting  the 
SNR  =  106  in  the  input  parameters  to  the  simulation. 
Thus,  there  was  almost  never  a  noise  event  and  the  RMS  error 
displayed  in  the  lower  curve  is  the  best  possible  for  the 
simulation  parameter  set  shown. 

An  upper  bound  on  performance  was  obtained  by  operating 
a  single  Snyder-Fishman  filter  against  the  same  noisy  input 
data  for  the  same  filter  parameter  set.  Recall  that  the 
Snyder-Fishman  filter  accepts  each  observation  as  having 
been  caused  by  the  signal  process. 

As  can  be  seen  in  Figure  25,  the  RMS  error  of  the 
multiple  model  adaptive  estimator  (with  the  simplifications 
described  in  Chapter  V)  matches  the  lower  bound  on  the  error 
for  values  of  g^O.l  (square  root  of  noise  strength). 
Above  this  value,  the  RMS  error  of  the  simplified  estimator 
is  greater  than  the  lower  bound,  but  in  all  cases  it  is 
better  than  the  performance  of  the  Snyder-Fishman  filter.  A 
figure  of  merit  which  takes  into  consideration  F(t),  G(t), 
and  Q(t)  is  the  square  root  of  the  steady  state  covariance 
(SS/RMS)  of  x(t).  For  this  one  dimensional  simulation 
example,  the  RMS  value  of  the  steady  state  variance  is 
/xgz/2  .  in  terms  of  the  SS/RMS,  the  RMS  error  of  the 
multiple  model  adaptive  estimator  matches  the  lower  bound 
for  SS/RMS  <_  0.3162,  and  the  error  is  greater  than  the 
lower  bound  for  larger  values  of  SS/RMS  .  For  the  parameter 


set  shown  in  Figure  25,  SS/RMS  =  0.3162  for  g  =  0.1  and 
SS/RMS  =  0.G324  for  g  =  0.2  .  The  significance  of  this  is 
that  as  the  SS/RMS  value  becomes  larger  than  the  "disper¬ 
sion",  R,  of  the  signal  rate  parameter,  the  multiple  model 
adaptive  estimator's  error  is  greater  than  the  lower  bound. 
For  values  of  SS/RMS  less  than  R  ,  the  estimator  essential¬ 
ly  makes  perfect  decisions  as  to  which  events  were  caused  by 
signal  and  which  were  caused  by  noise. 

Note  that  in  the  MMAE  curve  and  the  Snyder-Fishman 
curve,  the  expected  signal  to  noise  count  ratio  (SNR)  is 
one;  there  is  one  expected  noise  event  for  each  signal 
event . 

VI. o  Other  Simulation  Considerations 

As  mentioned  before,  all  of  the  simulation  results 
shown  in  this  chapter  are  based  on  knowledge  of  the  true 
dynamical  model  of  x(t)  Although  the  performance  levels 
will  change,  in  general,  with  mismatched  models,  the  trends 
displayed  in  these  simulations  should  still  be  evident. 

A  model  mismatch  could  easily  arise  for  several  rea¬ 
sons.  For  example,  we  may  not  know  exactly  how  to  model  the 
dynamics  of  x(t)  or  we  may  wish  to  approximate  a  known 
model  with  a  Simpler  model  to  reduce  the  computational  load 
of  the  elemental  filters. 

One  method  to  compensate  for  a  mismatch  is  to  add 
pseudonoise  to  the  elemental  filter  models  (Ref.  27  vol. 
1:224).  This  is  a  common  technique  in  Kalman  filter  tuning 
and  has  the  effect  of  reducing  the  filter's  confidence  in 
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its  own  model.  A  second  technique  which  may  be  useful  is  to 
impose  an  artificial  lower  bound  on  the  elemental  weighting 
factors  (Refs.  2,27  vol.2).  This  would  have  the  effect  of 
putting  more  confidence  in  the  less  likely  hypotheses;  the 
overall  estimate  would  be  more  heavily  influenced  by  the 
elemental  filter  outputs  associated  with  these  less  likely 
hypotheses.  In  general,  this  will  tend  to  increase  the 
error  over  that  obtained  when  the  model  is  known  exactly; 
however,  it  may  prevent  a  catastrophic  failure  in  which  the 
estimator  "locks"  onto  a  completely  incorrect  hypothesis 
model . 

VI. 6  Summary 

In  this  chapter,  a  one  dimensional  model  is  specified 
for  a  tracking  application  in  which  there  is  no  feedback 
control.  Simulation  results  are  presented  based  on  the  MMAK 
simplifications  of  Chapter  V.  The  RMS  error  performance 
sensitivities  to  the  major  parameters  of  the  estimator  are 
presented  and  discussed. 

The  results  are  indicative  of  the  overall  RMS  error 
performance  of  the  multiple  model  adaptive  estimator  and  of 
the  performance  trends  as  various  parameters  are  varied. 
This  simplified  filter  implementation  shows  excellent  acqui¬ 
sition  and  tracking  properties  when  the  various  filter  pa¬ 
rameters  are  tuned  to  the  underlying  process  of  interest. 
This  is  particularly  notable  considering  the  low  data  rate 
of  the  signal  and  the  low  expected  signal  to  noise  count 
ratios  expected. 
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VII.  Conclusions  and  Recommendations 


VI  1. 1  Conclusions 

The  goal  of  this  research  was  to  develop  an  estimator 
structure  for  the  particle  beam  problem  which  is  optimum, 
according  to  some  appropriate  criterion,  and  which  is 
insensitive  to  point  process  noise  corruption  in  the 
measurements.  The  multiple  model  adaptive  estimator, 
presented  in  Chapter'Ll,  provides  the  minimum  mean  squared 
error  estimate  of  the  underlying  process  of  interest.  When 
the  models  are  defined  as  separate  sequence  hypotheses,  the 
estimator  can  reduce  the  effect  of  noise  on  the  estimate. 
The  development  is  valid  for  any  point  process  signal  in 
point  process  noise;  it  does  not  rely  on  the  assumed 
conditionally  Poisson  statistics  of  the  particle  beam 
application  which  motivated  this  research.  The  full  scale 
estimator  does  require  an  exponentially  growing  number  of 
individual  hypothesis  filters. 

The  cross  product  space  modeling  concepts  of  Chapter 
III  provide  a  means  of  calculating  the  individual  filter 
weighting  factors.  This  development  is  also  valid  for  a 
general  point  process  signal  in  point  process  noise 
application  as  long  as  the  regularity  conditions  are  met. 
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The  specific  analytical  model  for  the  particle  beam 
application  meets  the  regularity  conditions.  The  cross 
product  space  modeling  concepts  allow  feedback  from  the 
observations  to  the  model,  thus  providing  a  way  to  define 
control  inputs. 

The  specific  expressions  for  the  estimator  suitable  for 
the  particle  beam  problem  are  developed  in  Chapter  IV.  The 
examples  presented  provide  insight  into  the  structure  of  the 
full  scale  estimator. 

The  simplifications  to  the  full  scale  estimator 
presented  in  Chapter  V  result  in  a  subopt inal  filter  which 
has  greatly  reduced  requirements  for  calculation  and 
storage.  The  approximation  for  the  signal  rate  parameter 
estimate  is  appropriate  for  the  particle  beam  problem  when 
the  model  of  the  underlying  process  is  known.  The  use  of 
data  windowing  to  stop  the  growth  in  the  number  of  elemental 
hypothesis  filters  is  applicable  to  the  general  point 
process  multiple  model  adaptive  estimator.  Three  methods 
are  proposed  to  implement  the  data  window  concept.  The 
"Best  Half”  method  is  recommended  because  it  limits  the 
growth,  it  implicitly  includes  information  from  the  entire 
measurement  history,  and  it  is  relatively  simple  to 
implement . 

The  simulation  results  of  Chapter  VI  show  that  the 
suboptimal  filter  (using  the  simplifications  developed  in 
Chapter  V)  is  extremely  good  at  reducing  the  error  caused  by 
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point  process  noise.  Performance  trends  are  demonstrated  as 


several  of  the  parameters  of  the  filter  are  varied.  The 
performance  degrades  slowly,  even  at  very  low  signal  count 
rates  and  very  low  signal  to  noise  count  ratios.  It  is 
expected  that  the  performance  would  degrade  more  rapidly  if 
the  estimator  had  inaccurate  knowledge  of  the  dynamics  of 
the  underlying  process. 

VI I . 2 _  Recommendat ions 

There  are  several  related  areas  of  research  which  could 
provide  immediately  useful  results.  First,  it  is 
recommended  that  a  method  of  predicting  the  error  of  the 
multiple  model  adaptive  estimator  be  developed.  Equation 
I'  (193)  provides  the  means  of  calculating  the  covariance  of 

the  full-scale  multiple  model  adaptive  estimator;  however, 
the  filter  covariances  of  the  elemental  Snyder-Fishman 
filters  depend  on  the  space-time  observations,  as  do  the 
calculations  of  the  weighting  factors.  Both  of  these  are 
required  for  calculation  of  the  overall  covariance  of  the 
multiple  model  adaptive  estimator.  Currently,  only 
simulation  techniques  provide  a  means  of  determining 

performance . 

The  second  area  of  recommended  research  is  in  the 
convergence  of  the  estimator.  The  key  question  to  be 
determined  here  is  whether  the  full  scale  estimator 
converges  to  the  correct  hypothesis  sequence,  and  how  rapid 


the  convergence  is.  The  information  theoretic  ideas  of 
Baram  (Ref.  6),  and  Hawkes  and  Moore  (Refs.  15,16)  appear  to 
provide  promising  avenues  of  research. 

Another  area  which  could  provide  immediately  useful 
results  is  in  the  definition  of  the  stochastic  optimal 
controller  for  this  point  process  problem.  The  cross 
product  space  modeling  ideas  allow  the  necessary  feedback 
control;  however,  the  specific  optimum  form  of  the  control 
is  not  addressed  in  this  dissertation.  A  natural  follow-on 
topic  is  the  investigation  of  a  separation  theorem  for  the 
optimum  controller,  or  perhaps  the  effects  of  forced 
certainty  equivalence  (Ref.  27  vol.  111:17). 

The  final  research  suggested  is  to  explore  the  effects 
of  imperfect  knowledge  of  the  dynamics  of  the  underlying 
process  (model  mismatch).  As  described  in  Chapter  VI,  the 
estimator  simulations  were  implemented  with  perfect 
knowledge  of  the  dynamical  model.  A  mismatch  would 
certainly  degrade  the  performance  of  the  estimator.  The 
extent  of  the  degradation  is  of  great  importance  in  cases 
where  a  simplified  model  is  desired  due  to  the  complexity  of 
the  true  model,  or  where  there  is  a  lack  of  accurate 
knowledge  about  the  true  model. 
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Appendix  A 


In  Chapter  III,  a  cross  product  space  model  is 
developed  and  regularity  results  from  Fishman  (Ref.  11)  are 
used  to  provide  the  means  of  calculating  the  weighting- 
factors  for  the  multiple  model  adaptive  estimator.  As 
described  pi'eviously,  this  approach  results  in  calculation 
of  the  elemental  estimates  through  the  use  of  the  Snyder- 
Fishman  filter  and  a  simple  weighted  sum  to  calculate  the 
overall  estimate.  The  penalty  of  this  method  is  the 
exponential  growth  of  memory  and  calculation  time 
requirements  for  the  full-scale  estimator. 

The  regularity  results  of  Chapter  III,  along  with  an 
Ito  diffusion  differential  rule,  allow  direct  estimation 
of  the  process  x(t).  By  "direct  estimate,"  we  mean  an 

A 

expression  for  x(t)  (perhaps  in  differential  form)  which 
does  not  use  multiple  model  adaptive  methods.  The  advantage 
of  this  is  that  there  is  no  exponential  growth  in  memory  or 
calculation  time  requirements  due  to  an  expanding  number  of 
hypotheses,  as  in  the  multiple  model  estimator.  The 
disadvantage  to  the  direct  method  is  that  it  requires 
evaluation  of  integrals  which  (depending  on  the  specific 
system  model)  can  be  much  more  complicated  than  those 
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required  by  the  multiple  model  method. 

We  begin  the  development  by  letting  V  be  a  regular 

space-time  point  process 


V:[to,t)XR-*[t0,t)XRm 

which  satisfies  Theorem  III-l.  Let  (fl ,  8  ,P(  •  ;cog) )  be  a 

probability  space  as  defined  in  Chapter  III  and  let 

/o  a  P  )  be  a  probability  space  defined  as  the  cross 
^  s’  s’  s ’ 

product  of  two  individual  probability  spaces 
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where  ^S2  is  as  defined  in  Chapter  III  (equation  99)  and 
models  the  randomness  of  the  noise  process  and  nSl  models 
the  randomness  of  the  signal  process.  Let  ^st  be  the 
subsigma  field  of  events  generated  for  te[t0,t)  . 

For  the  beam  pointing  and  tracking  problem,  we  are 
interested  in  estimating  the  process  x( t  ;o> ;^Si )  given 
observations  of  the  point  process  v,  that  is,  in  generating 


(A-l) 


x(t)  -  E{x(t  ;u;toSi  )  |  Bt> 

where  the  signal  rate  parameter  is  dependent  on  x(t)  as 
in  equation  3. 

We  model  the  process  of  interest  as  the  It6  diffusion 
process 

dx(t ;  oj  ;  uj  )  =  a(t;w;<jj  )dt+b(  t ;  u) ;  w  )du(t;w  )  (A-2) 

Sl  Sl  Si  Si 

K) 

where  u  is  a  k  dimensional  Wiener  process  and 
"a,b,  and  u  are  A  ®S.  measurable  processes.  This  is  a 

”  fa  1/  L 

more  general  model  than  that  given  in  equation  4,  and  we 
assume  that  a  unique  solution  to  equation  A-2  exists. 

For  this  It<5  model,  Fishman  (Ref.  11)  has  shown  the 
following  differential  rule.  For  a  proof,  see  reference 
11:170-174. 
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Theorem  A-l,  Generalized  Ito  Differential  rule.  Let 

d£(t ; a)  ;ojs)  =  a(t  ;w;ws)dt+b(t  ;w;tos)du(t  ;u!s) 

+^’Y(t>^;w;u)s)N(dtXd4;aj;(Js)  (A-3) 

Y 

YcRn 

where  a,b,  and  u  are  as  in  equation  A-2,  and  y  is  an  m 
dimensional  Ast®Bt  —  measurable  process  which  is  left 
continuous  in  t  and  continuous  in  r  If  iKt,£(t))  has 
a  continuous  derivative  in  t  and  continuous  partial 
derivatives  of  second  order  in  the  components 
of  X  for  te [t o ,T) , £eRm 


then  (w.p.l) 


d<Kt,?(t))  =  ||  dt  +  <  ,  a(t)  >  dt 

a? 


bbTi^jdt  +  j^l|jTbdu(t)  <A~4> 

C(t)+Y(t,*))-«(t,  ?(t))]N(dtXd*) 

Y 

The  angle  brackets  in  equation  A-4  denote  the  vector  inner 
product . ' 

□ 

In  order  to  use  this  differential  rule  to  obtain  the 

A 

estimate  x(t)  of  the  process  x(t)^  let  x(t)  be 
defined  as  in  equation  A-2  and  let 
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Xi  (t) 


(  A-6) 


C(t)  = 


x(t) 


n(t) 


Xn(t) 


n(t) 


With  this  definition  for  C(t)  we  can  now  write  equation 
A-3  as 


d  C(t  ;ui  ;o)  )  =  a'(t;ui;w  )dt  +  b'(t;w;w  )du(t;io  ) 


(  A-7) 


+  /  Y"(t,/i;w;w  )N(  dtX  dA. ;  o>  ) 
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and  where  the  arguments  on  the  hazard  functions  have  been 
dropped  for  ease  of  notation. 

For  i  =  1,2,  ...  ,n  we  define 


In  vector  form,  the  differential  representation  of  x(t)  is 


dx(t)  =  a(t)dt- 


L  y 


A.  A 


x(t)cf)-x(t)<J) 


dJi 


dt 


n 


+  (  jx(t)<!>-x(t)<J>}  1  N( dt Xcl^t ) 

Y 


(A-16) 


x(t  o )  =  E{x(t  o ) } 


The  conditional  error  covariance  matrix 


E 

”t 


A 


E J[x(t )-x(t)][x(t)-x(t )]  j  B 


-  x(t)xT(t) 


a  a  m 

x(t)x  (t) 


( A-17 ) 


can  be  expressed  in  differential  form  (Ref.  11:178)  as 


r  SL  —  .C.T  t! 

=  (a  xl  -a  xi)+(x  a1  -  x  a  )+  b  b  Jdt 


“j 


+  y*E|(x-x)(x-x)T((J)-i)  |Btji“lN(dt  X  d*) 


-/e  {<*- 


x)(^-4>)  |  B 


•  E  l(x-x)T(<j)-J)|Bt[i  ZN(dt  X  cl*) 


( A-18 ) 


Equations  A-16  and  A-18  give  the  differential 
representation  of  the  quantity  to  be  estimated  (directly) 
and  the  error  covariance,  respectively.  We  can  now  cast  the 
particle  beam  model  in  terms  of  equation  A-2  to  obtain  the 
direct  estimator  equations  for  this  application. 

As  in  Chapter  III,  the  model  for  the  process  x(t)  is 
given  by 
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dx(t;a);u)  )  =  F(  t  )x(  t  ;to ;  to  )dt  +  G(t)du(t;w  ) 
S  S  s 


(A- 19) 

x(to)  =  X0  to  -  t 

and  the  hazard  function  for  the  observed  point  process  is 


<Kt,r;u;us)  =  Ag(t ,  r  ;to  ;wg)  +  An(t ,  r;u;wg)  (A-20) 

where  the  signal  rate  parameter  is  given  by  equation  103. 
The  noise  rate  parameter  is  left  general  in  form;  we  only 
require  that  the  observed  point  process  satisfy  Theorem  III- 
1  (it  is  regular  ) . 

By  comparing  our  particle  beam  problem  model  (equation 
A-19)  and  the  general  Ito  model  (equation  A-2),  we  can 
express  the  Ito  model  terms  as 
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where 


x(t)  -  £•  x(t;u>;wSi)|6t  (A-24) 


<t>  =  Ejxs(t,r;w;a)Si  )  +  *n(t  ^  r  ;u>  jto^  )  |  Bt  (A-25) 


^er  ^  a  \-  — 

x(t)<J>  =  Ejx(t;u;wSj)<{)(t,r;co;ajs)  |  Bt 

Conditional  expected  values  of  the  form  of  equations 
(A-23)  through  (A-26)  are  difficult  to  solve.  One  approach 
is  to  assume  a  density  for  the  quantity  in  question  and  then 
solve  for  the  moments  (or  a  partial  set  of  the  moments) 
(Ref.  27  Vol.II).  The  complexity  of  solving  for 
either  directly  or  through  the  use  of  an  assumed  density 
motivates  the  multiple  model  adaptive  estimator  approach 
taken  in  Chapter  II. 


(A-26) 
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